Abstract

The methodology and developmental history of incremental compilation are
discussed. The implementation of incremental compilation in the PECAN
programming environment generator is discussed in detail. The PECAN
environment generated for Pascal has been modified to support
procedure-by-procedure compilation, and complete (traditional) compilation.
The time efficiency of these compilation methods is compared with that of
incremental compilation.
Chapter 1
Introduction 

Incremental compilers are designed so that only part of a program under 
development need be recompiled after a change has been made to its source code. 
This can be effected in one of two ways: 

• by choosing a structure of the language and recompiling that whole 
structure whenever part of the structure is edited; or 

• by determining the smallest amount of recompilation required after each 
individual editing change and recompiling only that section of the source 
code. 

Using the first method (generally) involves unnecessary recompilation, but 
determining what source code to recompile is trivial. The second method performs 
no unnecessary recompilation, but requires extra computation to determine what 
source code to recompile. 

The aim of this thesis project is to compare the relative efficiencies of these two
approaches. To this end, an existing system (the PECAN programming
environment generator) has been modified so that it allows compilation to be
performed using either of the two methods of incremental compilation. Several
example programs were chosen and edited so that comparisons could be made.

Chapter 2 discusses these two approaches in detail, and examines the difficulties 
caused by a programming language's ability to use names. Factors which affected 
the development of incremental compilers, and their relationship to programming 
environments and syntax-directed editors are discussed. 

Chapter 3 gives examples of a number of incremental systems, and discusses the 
role of attribute grammars in generating programming environments. 

Chapter 4 gives a description of the PECAN programming environment generator. 

Chapter 5 gives a detailed description of the implementation of incremental 
compilation within the PECAN system. 



PECAN takes the second of the two approaches mentioned above; it determines 
the smallest amount of compilation necessary after each change to the source. 
Chapter 6 describes how the PECAN environment for Pascal has been modified to 
allow procedure-by-procedure compilation and complete compilation, in addition to 
its incremental compilation. A benchmark was chosen for comparing these 
methods, and the results of a number of tests are included. 

Conclusions are drawn in Chapter 7. 

Part of the project involved the implementation of a new window for PECAN 
which provides a view of the internal data structure used by PECAN's compilation 
module. That view is described in Appendix A. Listings of the files that provide 
the view are included. 

Details of the modifications made to PECAN's compilation module, with program 
listings, are given in Appendix B. 

Appendix C lists the programs used in the tests described in Chapter 6. 

Appendix D gives a detailed description of Earley's parsing algorithm (the 
algorithm used by PECAN). 



Chapter 2 
Incremental Compilation 



2.1. Definition of Incremental Compilation 

The development of a program can usually be characterized by an extended 
sequence of repeatedly editing and compiling source code. The programmer will 
often recompile a program after having made only a small change to the source 
code. If there is a large amount of source code, and the changes made are 
relatively minor, the compiler will be wasting much time and effort compiling 
source code which has not been changed since the last time that the program was 
compiled. 

It is desirable that the programmer should have the convenience of a recompiled 
version of the program, ready to execute, as soon as possible after a change is 
made to the source code. This is particularly true when the program is being 
debugged and the programmer wants to monitor the effect upon the program's 
behaviour of a small modification. 

A compiler is incremental if it provides the programmer with a recompiled
version of the program "by expending an amount of effort which is proportional to
the size of the change made by the programmer." 1

2.2. Deciding What to Recompile 

2.2.1. The Recompilable Unit 

Ideally an incremental compiler will recompile as little of the source code as
possible after each modification. In this thesis, the term recompilable unit will be
used to describe that structure in a programming language which is recompiled by
an incremental compiler when a change is made. 2



1
Per Earley and Caizergues in [Earley 72].



2
The term minimal separately compilable unit is used in [Reiss 84a], and the term smallest compilation
unit is used in [Fritzson 83a].



Consider the following hypothetical language: a program is composed (inter alia) 
of statements; statements may be composed (inter alia) of expressions; and 
expressions may be composed (inter alia) of integers, which are sequences of digits. 
If the language is defined so that no change to a statement can affect the meaning 
of any part of the program outside that statement, then the statement is chosen as 
the recompilable unit. 

However, if the programmer changes the value of an integer by altering a single 
digit, it may be that the code produced by recompiling the enclosing statement 
differs from the corresponding previously-compiled code only in the manner in 
which it represents that integer. Even though the compiler is incremental, it has 
performed unnecessary recompilation; it could have achieved the desired effect 
merely by replacing the code representing the original integer with code 
representing the modified integer. Alternatively, altering a single digit may 
radically change the code which will be produced for the enclosing expression, and 
possibly the enclosing statement. 

For example, assume that the following is a valid statement in this hypothetical 
language 

    IF X < 10 THEN
        GOTO Label1
    ELSE
        GOTO Label2

If the integer constant is changed from a 10 to a 9, the object code generated for 
the entire (modified) statement will differ from the previously-compiled code only in 
its representation of the integer 9. However, if the variable X is changed to the 
integer constant 9, the object code generated to evaluate the new boolean 
expression (9 < 10) will be quite different from the code generated to evaluate the 
old boolean expression (X < 10); no code will be required to look up the value of 
X. Furthermore, if the compiler performs simple code optimization then the object 
code for the entire statement can be replaced by object code to represent the 
statement 

    GOTO Label1

because the new boolean expression (9 < 10) is tautologous. 
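
The optimization described above can be illustrated with a minimal sketch in Python (chosen here only for illustration). The tuple representation of the statement and the helper fold_if are hypothetical; they are not taken from any system discussed in this thesis.

```python
import operator

# Hypothetical representation of the statement:
#   ("if", (op, left, right), then_label, else_label)
# Operands are ints (constants) or strings (variable names).
COMPARISONS = {"<": operator.lt, ">": operator.gt, "=": operator.eq}


def fold_if(stmt):
    """Replace an IF whose condition compares two constants with a plain GOTO."""
    _, (op, left, right), then_label, else_label = stmt
    if isinstance(left, int) and isinstance(right, int):
        # Both operands are known at compile time, so the test can be
        # evaluated now and no comparison code need be generated.
        taken = then_label if COMPARISONS[op](left, right) else else_label
        return ("goto", taken)
    return stmt  # a variable is involved, so the test must remain


original = ("if", ("<", "X", 10), "Label1", "Label2")
edited = ("if", ("<", 9, 10), "Label1", "Label2")
print(fold_if(original))   # unchanged: X must be looked up at run time
print(fold_if(edited))     # reduced to a single GOTO
```

Replacing X by 9 changes the shape of the code for the whole statement, whereas replacing 10 by 9 would change only one operand.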

2.2.2. Choosing the Smallest Recompilable Unit 

Incremental compilers can be usefully divided into two classes based upon their
approach to the problem of deciding what to recompile after each change. Some
choose a syntactic unit of the language (independent of any particular program) as
the recompilable unit. This recompilable unit is recompiled whenever a change is
made within that unit. Others attempt to determine the smallest recompilable
unit (specific to the change being made) in order to be able to recompile as little
as possible. These two approaches will be referred to as α-type and β-type
respectively.

α-type incremental compilers will generally perform unnecessary recompilation
after each change. 3 β-type incremental compilers will recompile only what is
necessary, but incur considerable overheads in time and (usually) space in order to
determine the smallest recompilable unit. The Magpie system (see §3.3.5) is an
example of an α-type system. PECAN (see Chapter 4) is an example of a β-type
system.

Balancing the costs of these two approaches is the fundamental question in 
incremental compiler design, and the crux of this thesis project as discussed in 
Chapter 6. 
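
The difference between the two classes can be sketched as two ways of climbing a parse tree from the edited node. The sketch below is in Python and rests on assumptions: the Node class, the function names fixed_unit and smallest_unit, and the propagates predicate (which stands in for whatever analysis decides that a change affects the enclosing structure) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)
class Node:
    kind: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child

    def size(self) -> int:
        """Number of nodes in this subtree: a crude proxy for recompilation cost."""
        return 1 + sum(c.size() for c in self.children)


def fixed_unit(edited: Node, unit_kind: str) -> Node:
    """First class: the nearest enclosing node of a kind chosen in advance."""
    node = edited
    while node.kind != unit_kind and node.parent is not None:
        node = node.parent
    return node


def smallest_unit(edited: Node, propagates: Callable[[Node], bool]) -> Node:
    """Second class: climb only while the change affects the parent."""
    node = edited
    while node.parent is not None and propagates(node):
        node = node.parent
    return node


procedure = Node("procedure")
statement = procedure.add(Node("statement"))
procedure.add(Node("statement"))
expression = statement.add(Node("expression"))
constant = expression.add(Node("integer"))

# Assumption for this example: editing the constant affects nothing above it.
a = fixed_unit(constant, "procedure")
b = smallest_unit(constant, propagates=lambda n: False)
print(a.kind, a.size())
print(b.kind, b.size())
```

The first function does no analysis but returns a larger subtree; the second returns a smaller subtree at the price of evaluating the predicate at each level.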

2.2.3. Problems Caused by Names 

In the example given in §2.2.1, the statement was chosen as the recompilable unit 
on the basis that a change to a statement could not affect the meaning of any 
part of the program outside that statement. Unfortunately, the ability to use 
names in a programming language complicates the task of incremental compilation. 

If the part of the source code that is being modified is a declaration then that 
modification may well affect the meaning of statements throughout the rest of the 
program. Statements within the scope of the declaration will need to be checked 
to ensure that the modification to the declaration has not invalidated references to 
the declared name. If the part of the source code that is being modified is a 
statement which refers to a name then the validity and meaning of that reference 
is dependent upon declarations and references elsewhere in the program. 

The manner in which various incremental systems have dealt with this problem is 
discussed in Chapters 3 and 5. The recompilable unit remains (as defined above) 
that structure which will be recompiled. However, it is important to remember 
that further checking may be necessary. 
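
One simple way to support this further checking is to record, for each name, the statements that refer to it. The following Python sketch is an illustration only: the NameIndex class is hypothetical and it ignores scope rules, which a real system would have to respect.

```python
from collections import defaultdict


class NameIndex:
    """Maps each declared name to the statements that refer to it."""

    def __init__(self):
        self._users = defaultdict(set)

    def record_reference(self, name, statement_id):
        self._users[name].add(statement_id)

    def to_recheck(self, changed_name):
        """Statements whose references must be re-validated after a declaration edit."""
        return sorted(self._users.get(changed_name, ()))


index = NameIndex()
index.record_reference("X", 200)
index.record_reference("X", 350)
index.record_reference("Y", 400)
print(index.to_recheck("X"))
```

When the declaration of X is edited, only the statements recorded against X need to be re-checked; the statement that refers to Y is left alone.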



3
Note that a normal compiler (i.e. a "non-incremental" compiler) can be thought of as an α-type
compiler with the entire program or (as in the case of Modula-2 or C) a component module as its
recompilable unit.



2.3. Development of Incremental Systems 

2.3.1. Programming Environments 

The idea of building a compiler which compiles incrementally was mooted as long 
ago as the late 1960s [Braden 68, Katzan 69, Peccoud 69, Rishel 70]. Even so, 
relatively sophisticated incremental compilers were not implemented until the (fairly 
recent) development of programming environments. Programming environments use 
copious amounts of computer resources and it is only with the advent of powerful, 
single-user computers that the implementation of programming environments has 
become feasible. 

A programming environment provides the user (the programmer) with a number
of integrated, interactive tools so that she/he may create, modify, execute and
debug a program. 4 If the environment is to be highly interactive then the
programmer must be regularly informed of errors in the program and given the
opportunity to correct them. In order for program development to be practicable,
the compiler must have a fast response time. To ensure a fast response, the
compilation should be done incrementally.

The environment should provide more than just a suite of tools which share a 
common database of information about the program. The various tools should be 
presented to the programmer as a single tool; there should be no "fire walls" 
separating the various functions of the environment. The programmer should be 
able to develop programs within the environment without having to "perform 
mental context switches" [Delisle 84]. 

This amalgamation can be achieved by linking the compiler to the editor (as
described in §2.3.2), and by allowing debugging commands to be entered using the
language which is being supported by the environment. 5 This latter step obviates
the need for a programmer to learn a series of special debugging commands, and
makes it easier for the programmer to view the environment as a single paradigm.



4
Cedar [Teitelman 84, Swinehart 86] is an example of a complete environment; as well as providing a
programming environment, facilities exist for document processing, electronic mail and graphics image
editing.

5
For example, the Interlisp system [Teitelman 81] provides a single command language for
programming, debugging and editing.

6
The authors of [Delisle 84] make the point that, in such a system, "The debugging mechanisms
inherently follow not only the notation and semantics of the programming language, but also its
philosophy."



Debugging commands entered in the supported language can be (incrementally) 
compiled and executed. However, this approach may prove to be disadvantageous 
in some cases. If the programming language which is supported by the 
environment is highly-readable but verbose, it will be difficult for the programmer 
to construct concise debugging commands. The disadvantage of having a verbose 
debugging language must be balanced against the advantage of allowing the 
programmer to view the environment as a single paradigm. 

2.3.2. Syntax-Directed Editors 

A syntax-directed editor (or SDE) allows the programmer to edit the program 
within the context of the language in which that program is being written. 
Programs are stored internally not as a list of characters but as a parse tree. The 
program is edited in terms of that parse tree, rather than in terms of the textual 
representation of the program. This means that the operation of the SDE can be 
strongly linked with that of an incremental compiler, which is one reason why 
programming environments usually employ SDEs. 

An SDE can be generated from the specifications of a programming language. 7 It 
is often expedient to modify that specification so that commonly-used constructs 
can be created in the SDE without having to move through an inordinately large 
number of levels. 8 Conversely, it is often useful to modify the language 
specification by adding new levels of structure to save the programmer from being 
offered a surfeit of choice at each level. 

SDEs provide the programmer with two types of command: generic tree 
manipulation (e.g. deleting a sub-tree from the parse tree; traversing a sub-tree), 
and language specific commands (e.g. creating a specific statement). Cursor 
movement can be structural or textual. Structural movement is constrained by the 
structure of the parse tree that represents the program. Although such movement 
is often sufficient, it can be frustrating for the programmer if the destination is 
"virtually close but structurally far away" [Garlan 84]. For this reason, most 
SDEs allow both structural and textual movement. 9 



7
The Cornell Synthesizer Generator [Reps 84] and the PSG system (see §3.3.6) use attribute grammars
(see §3.4) to generate syntax-directed editors for arbitrary languages.

8
Examples of this are given in [Garlan 84].

9
Textual movement is often implemented using a pointing device (e.g. a mouse).



2.3.2.1. Advantages and Disadvantages 

SDEs simplify the programmer's editing task in a number of ways. Keywords can
be specified in an abbreviated form. The SDE will be able to determine which 
keyword is desired from the syntactical context of the cursor position. 
Alternatively, a list of those keywords which could validly appear at the current 
cursor position can be displayed (as a menu) and the desired keyword chosen using 
some pointing device. This feature can help a programmer to learn the rules of 
the language. 

SDEs make large demands upon computer resources, especially on space required 
to store the program as a parse tree. However, the main disadvantage of SDEs 
arises from their insistence that the program be consistently correct before and 
after each editing change. The shortest or most natural sequence of editing 
commands which change a legal program P into a legal program P' may take the
source code through a series of invalid programs. If all errors are flagged as they 
are detected, the programmer is left to distinguish between substantial errors in the 
program and those transitional errors caused by the editing changes. 

One solution to this problem would be to allow the programmer to effectively 
turn off the error checking mechanism, and to turn it back on when she/he 
believes that the code is valid again. This approach makes the programming 
environment less interactive. Some programming environments solve the problem 
by not allowing the programmer to move the cursor past the first error detected in 
the code. 10 In this manner the validity of all of the code above the cursor can be
guaranteed, although the programmer may be forced to follow a convoluted path of 
editing commands to change the program. 11 

Another solution is to use templates. This means that the SDE can maintain a 
syntactically valid program, even though some of the constructs may be shells, 
from which details are missing. 



10
E.g. the system discussed in [Morris 81].

11
In such an environment, the only error which need be flagged is the first; subsequent errors will be
flagged when the first is corrected. This may seem an inappropriate manner in which to display errors. 
However, it must be remembered that the first compilers which gave as many error messages as possible 
were developed at a time when compilers were run in batch queues, and system resources were scarce. 
Programmers required as many error messages as possible from each attempted compilation. Such 
considerations are not relevant to the question of when to flag error messages in an interactive, 
incremental programming environment. 






A further difficulty with using SDEs is that the programmer has to adapt 
herself/himself to entering expressions in a prefix manner. The developers of the
GNOME programming environment claim that those students using GNOME who 
had programming experience found this awkward at first, while those who had no 
previous programming experience found it easy [Garlan 84]. 12 

2.3.2.2. Triggering Recompilation 

Given that the aim of an incremental compiler is to update the object code after 
each change to the program, it follows that recompilation should be triggered by 
the SDE. It is important to decide exactly what constitutes an editing change. 

The SDE will allow the programmer to indicate, in some way, that a change has 
been made and can now be processed (e.g. by typing the RETURN key). A β-type
incremental compiler will proceed immediately to find the smallest recompilable
unit in order to recompile that. Such a prompt response may be premature if the
compiler is α-type. It may be that the programmer wants to make two or more
changes within the same recompilable unit. The changes are reflected immediately
in the SDE's parse tree, but the α-type incremental compiler may be triggered by
the SDE only after the programmer has finished making changes within that 
recompilable unit. This may be when the SDE cursor is moved out of the 
recompilable unit, or when the programmer chooses a compile option. 

Implementing such a system requires that a distinction be drawn between the two 
main tasks of a compiler: 

• syntactic checking - ensuring that the program (or program fragment) is 
syntactically correct; and 

• translation - converting the program (or program fragment) into an 
executable form. 

The syntactic checking is performed by the SDE when it constructs its parse tree. 
It is the translation phase of compilation which is triggered after the recompilable 
unit has been edited. 
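
The deferred triggering described above can be sketched as follows. This is a Python illustration under stated assumptions: the DeferredTrigger class and its method names are hypothetical, and the translate callback stands in for the translation phase of a real compiler.

```python
class DeferredTrigger:
    """Translate a unit only once the cursor leaves it (hypothetical interface)."""

    def __init__(self, translate):
        self._translate = translate   # callback that performs the translation
        self._current = None          # unit that currently holds the cursor
        self._dirty = False           # has that unit been edited?

    def edit(self, unit):
        self.move_cursor(unit)
        self._dirty = True

    def move_cursor(self, unit):
        if unit != self._current:
            self.flush()
            self._current = unit

    def flush(self):
        """Also usable as an explicit compile command."""
        if self._dirty:
            self._translate(self._current)
            self._dirty = False


translated = []
trigger = DeferredTrigger(translated.append)
trigger.edit("proc_a")
trigger.edit("proc_a")          # second change in the same unit: nothing translated yet
trigger.move_cursor("proc_b")   # leaving proc_a triggers a single translation
print(translated)
```

Two edits within the same unit lead to one translation rather than two, which is the saving that motivates the delay.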

The use of SDEs makes it difficult to postpone syntactic error checking (as 
discussed in §2.3.2.1) unless it is possible to store syntactically incorrect code in 
the parse tree (flagged in some way so as to indicate that the code contains 
syntax errors). Static semantic error checking can easily be postponed until 
translation. 



12
See also [Chandhok 85].






There is a sense in which this approach departs from the ideal of incremental 
compilation. After all, the compiler is no longer providing a compiled version after 
each editing change to the source code. 13 However, such a system remains
incremental insofar as it does not require complete recompilation after modifications 
have been made to a program. It also has the advantage of delaying error 
checking, effectively turning error checking off until the recompilable unit has been 
edited. 

This approach is adopted in the Magpie system (see §3.3.5) and forms the basis
of the modifications made to the PECAN system as part of this thesis project (as 
described in Chapter 6). 



13
Unless one takes the somewhat tenuous view that several editing changes within the one recompilable
unit constitute a single editing change.






Chapter 3 
Examples of Incremental Systems 

3.1. Early Incremental Systems 

3.1.1. Incremental BASIC - 1968 

An implementation of an incremental system for the BASIC language is described 
in [Braden 68]. This system uses α-type incremental compilation. As each line of
code is entered, it is compiled into machine code and a reference to that code is 
stored in a program vector. When a line is modified it is recompiled. Most 
statements are executed in machine code, but statement-to-statement code is 
handled interpretively, by moving through the program vector. 

There are difficulties in implementing such a system even for a language as 
context-independent as BASIC. For example, if the user enters the following lines 

100 DIM X(10) 
200 LET X(1)=0 
100 DIM X(10,10) 

the assignment statement in line 200 was valid when first entered but, due to the 
change in the definition of the X array, it has become invalid. Yet, the system
will not recompile the offending line because it was valid when first entered. If 
the compiler was forced to compile the entire source file in order to rectify this 
problem then any time saved due to incremental compilation would be lost. One 
solution would be to treat a reference to an element of a one-dimensional array as 
a special case of a reference to an element of a two-dimensional array. This would 
mean that the code generated when line 200 is first entered will still work 
correctly after the X array is redefined. The authors of [Braden 68] give this 
solution serious consideration, rejecting it only because it is not sufficiently general 
to handle all such problems. 

1
i.e. branching statements (GOTO, GOSUB).

The only remaining solution is to recompile only the statement that was changed
and check references to the X array for validity at run-time. This solution moves
the implementation a little away from the ideal of an incremental compiler because
the context-sensitive checking is being deferred from compile-time to run-time. But
the authors justify using this solution on the grounds that it is preferable to the
other options and that the system is intended for use by students who will usually
write small programs that are run correctly only once.
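
The idea of deferring the check to run-time can be illustrated with a small Python sketch. It is not a reconstruction of the system in [Braden 68]: Python closures stand in for machine code, and the names compile_dim, compile_let, run_program and SubscriptError are hypothetical.

```python
class SubscriptError(Exception):
    """Run-time report of a reference that no longer matches its declaration."""


arrays = {}    # array name -> tuple of declared bounds
values = {}    # (array name, subscripts) -> stored value
program = {}   # the "program vector": line number -> compiled line (a closure here)


def compile_dim(name, *bounds):
    def run():
        arrays[name] = bounds
    return run


def compile_let(name, subscripts, value):
    def run():
        # The check is made at run time against whatever declaration is
        # current, so this line never needs to be recompiled.
        declared = arrays[name]
        if len(subscripts) != len(declared):
            raise SubscriptError(f"{name} is declared with {len(declared)} dimension(s)")
        values[(name, subscripts)] = value
    return run


def run_program():
    for number in sorted(program):
        program[number]()


program[100] = compile_dim("X", 10)
program[200] = compile_let("X", (1,), 0)
run_program()                              # valid: X is one-dimensional

program[100] = compile_dim("X", 10, 10)    # only line 100 is recompiled
try:
    run_program()
except SubscriptError as err:
    print("line 200:", err)
```

Line 200 is compiled once and never revisited; the mismatch introduced by the new line 100 is detected only when the program is run.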

3.1.2. Languages with Nested Statements - 1972 

Earley and Caizergues describe another α-type incremental compilation system
in [Earley 72]. The authors make the point that it is a relatively easy task to 
incrementally compile programs which have been written in a language which does 
not allow nested statements. In such a language the meaning of each statement is 
usually independent of those statements around it, so it is necessary to recompile 
only the lines that are actually altered. If a declaration is changed, the 
recompilation can be limited to those statements within the scope of the 
declaration. However, if the language allows nested statements then the question 
of statement independence can be greatly complicated. 

The authors' solution to this problem is to distinguish between simple and nested 
statements. The language is redefined so that simple statements may only appear
on a single line, while nested statements may appear on several lines. Skeleton 
entries are maintained for each line of code. These entries link the source line 
with the corresponding compiled code and each includes a pointer to the next line's 
skeleton entry. If the line is the beginning of a nested statement, a pointer in the 
skeleton entry refers to the entry for the line which ends the nested statement. If 
part of a nested statement is modified, only the body of that nested statement 
need be recompiled. Although the authors see the structure as a list of 
statements, the skeleton entries could just as easily have been thought of as nodes 
of a tree. 
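
The skeleton entries can be sketched in Python as follows. This is an illustration under assumptions rather than the data structure of [Earley 72]: list order stands in for the pointer to the next entry, and the keywords WHILE, IF and END are assumed openers and closers of nested statements in a hypothetical language.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Skeleton:
    source: str                  # the source line
    code: Optional[str] = None   # stands in for the compiled code
    end: Optional[int] = None    # index of the closing line, if this line opens a nested statement


def build(lines: List[str]) -> List[Skeleton]:
    """Create one entry per line and link each opener to its closing line."""
    entries = [Skeleton(line) for line in lines]
    open_stack = []
    for i, entry in enumerate(entries):
        word = entry.source.split()[0]
        if word in ("WHILE", "IF"):    # assumed openers of nested statements
            open_stack.append(i)
        elif word == "END":            # assumed explicit end marker
            entries[open_stack.pop()].end = i
    return entries


def lines_to_recompile(entries: List[Skeleton], edited: int) -> range:
    """Body of the innermost nested statement containing the edited line."""
    best = range(edited, edited + 1)   # a simple statement outside any nesting
    for i, entry in enumerate(entries):
        if entry.end is not None and i < edited < entry.end:
            best = range(i + 1, entry.end)   # later matches are more deeply nested
    return best


entries = build([
    "X := 0",
    "WHILE X < 10",
    "IF X = 5",
    "Y := 1",
    "END",
    "X := X + 1",
    "END",
    "Z := 2",
])
print(list(lines_to_recompile(entries, 3)))   # inside the IF
print(list(lines_to_recompile(entries, 5)))   # inside the WHILE only
```

An edit inside the inner statement selects only its body, while an edit elsewhere in the outer statement selects the whole outer body, including the inner statement.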

The authors identify a problem with this method where the language being 
implemented does not have an explicit end for each nested statement. However, it 
would seem that such languages could be implemented simply by defining an end 
(with a null production) for each nested statement. 

The appropriate lines are recompiled only when all of the editing is complete. 
This delay is for two reasons: it avoids duplicating recompilation, and it doesn't 
force the user to keep the source code syntactically correct at all times. 



2
Indeed it is difficult to see why a tree structure was not used; it would seem to be a preferable
paradigm.
3.2. Conversational Systems

Conversational systems were precursors of the more sophisticated incremental 
compilers. A conversational system can be distinguished from a system which 
incorporates incremental compilation by the fact that, although it aims to provide 
a high level of interactivity, it still compiles all of the source code when changes 
are made. 

3.2.1. CONA and COPAS - 1978 and 1981 

The CONA and COPAS systems [Atkinson 78, Atkinson 81a] are implementations 
of conversational Algol and conversational Pascal respectively. The program's 
source code is converted into an intermediate form which can be efficiently 
interpreted. When changes are made to the program, the entire program (that is 
the intermediate representation and the new text) is converted into the 
intermediate form. Modifications to the code are checked for validity immediately. 
If the source contains an error, the compiler halts and waits until the error is 
corrected before the rest of the text is scanned. 

Neither of these systems is significantly faster than a system which has a 
separate text editor and compiler, but the designers point out that the 
conversational systems were designed for use by novices who write small programs. 
For small programs this method compiles code quickly enough, and both systems 
do provide the user with recompiled code after each modification. 

3.3. Incremental Systems in Programming Environments 

3.3.1. The Cornell Program Synthesizer - 1978 

The Cornell Program Synthesizer [Teitelbaum 81] was the first major 
programming environment to treat programs as "a hierarchical composition of
syntactic objects, rather than (as) a sequence of characters." The Synthesizer 
supports the development of programs in PL/CS (a dialect of PL/I). Programs 
are edited using an SDE. Templates are used for all but the lowest level language 
structures (or phrases) which are entered as a character string and parsed. 
Phrases are checked for syntactic and semantic errors. Compilation (into an 
interpretable form) is performed each time a template or phrase is inserted. 

Incomplete programs may be executed. Execution halts when an unfilled 
template is encountered, but can be resumed after editing changes have been made 
(unless a declaration is altered). If a change is made to a declaration, all of the 
phrases within the scope of that declaration are re-checked. 
The Synthesizer has been generalized with the development of the Synthesizer
Generator [Reps 84] which generates SDEs from languages specified using attribute 
grammars. 

3.3.2. Smalltalk-80 - 1980 

Smalltalk-80 [Goldberg 83, Goldberg 84] is an interactive, integrated programming 
environment. Smalltalk-80 is also an object-oriented programming language 
supported by the Smalltalk-80 environment. The environment is defined in terms 
of the language so the programmer is presented with a single paradigm. 

The basic element in the Smalltalk-80 language is the object, which has its own 
data (not accessible by other objects) and methods. Methods are programs which 
respond to messages passed between objects. Programming in Smalltalk-80 is a 
matter of creating objects and specifying how those objects will communicate with 
each other. Methods are edited using a simple text editor. Smalltalk-80 uses 
α-type incremental compilation, using the method as the recompilable unit.
Methods are translated into sequences of instructions for a stack-oriented 
interpreter. 

3.3.3. IPE - 1981 

The IPE (Incremental Programming Environment) system is described 
in [Medina-Mora 81]. IPE supports the development of programs in the language 
GC (a variant of the language C, with module structure and type checking). 
Programs are edited using an SDE which is completely template-driven: textual
input is not supported. The editor ensures syntactic correctness and performs 
semantic checking. 

IPE uses an α-type incremental compilation strategy. Code is produced only when
a procedure is semantically correct. The procedure is automatically compiled,
loaded and linked into the existing executable code for the program. If a 
subsequent change outside the procedure (e.g. to the declaration of another 
procedure) makes an already compiled procedure semantically incorrect, that 
procedure code is replaced by a code stub. If executing the program causes that 
code stub to be executed (i.e. if the semantically incorrect procedure is invoked) 
then execution halts so that the procedure may be modified. 
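
The code stub mechanism can be illustrated with a short Python sketch. The table of procedures and the names make_stub, invalidate and ProcedureInvalid are hypothetical; the sketch shows only the general idea and not the way IPE patches executable code.

```python
class ProcedureInvalid(Exception):
    """Raised when execution reaches a procedure whose code was withdrawn."""


def make_stub(name):
    def stub(*args):
        raise ProcedureInvalid(name)
    return stub


# Hypothetical table from procedure names to their executable code.
code_table = {"area": lambda w, h: w * h}


def call(name, *args):
    return code_table[name](*args)


def invalidate(name):
    """Replace compiled code by a stub after an outside change breaks it."""
    code_table[name] = make_stub(name)


print(call("area", 3, 4))
invalidate("area")
try:
    call("area", 3, 4)
except ProcedureInvalid as err:
    print("halt: edit procedure", err)
```

The rest of the program remains executable after the invalidation; execution stops only if the invalid procedure is actually reached.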

IPE was designed "to provide the comfort of a flexible and interactive 
programming environment for compiler-based languages." To this end it maintains 
two internal representations of the program under development: the tree 
representation and the executable representation. The executable representation is
generated from the tree representation, and may be generated so that it can be 
executed on a different system from that on which the IPE system is being run. 

3.3.4. PECAN - 1984 

The PECAN programming environment generator is discussed, in considerable 
detail, in Chapter 4. 

3.3.5. Magpie - 1984 

The Magpie programming environment supports the development of Pascal 
programs on an experimental workstation. The system's method of incremental 
compilation is described in [Schwartz 84]. Magpie uses a sophisticated α-type
compilation technique. 

Magpie divides Pascal programs into fragments: statement bodies, variable 
declarations, constant definitions, type definitions, label declarations and headings 
(of procedures, functions and the main program). The text of these fragments is 
stored as a sequence of tokens. Use of an uninterpreted token (representing an 
incomplete token, an incorrect token or un-scanned text) means that all of the text 
can be tokenized at any time. 

Magpie breaks the compilation process into three distinct phases: scanning, 
parsing and recompilation (translation into machine code). Each of these phases 
has its own unit of incrementality. Scanning will respond to a changed character, 
but the parser will not respond to that change unless it means a change to a 
token. For example, changing the value of an integer constant means only a small 
change to the appropriate token. However, if the change to the text changes the 
type of the token (say, from an integer constant to a real constant) then the 
parser is invoked. 
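
The distinction between a change to a token's value and a change to its type can be sketched as follows. This Python illustration uses a deliberately tiny, hypothetical tokenizer and is not Magpie's scanner; it decides whether the parser is needed by comparing the sequence of token kinds before and after an edit.

```python
import re

# A tiny tokenizer for illustration: reals, integers, identifiers, ":=" and
# any other single non-blank character.
TOKEN = re.compile(r"\d+\.\d+|\d+|[A-Za-z_]\w*|:=|\S")


def kind(text):
    if re.fullmatch(r"\d+\.\d+", text):
        return "real"
    if re.fullmatch(r"\d+", text):
        return "integer"
    if re.fullmatch(r"[A-Za-z_]\w*", text):
        return "identifier"
    return text  # each symbol is its own kind


def scan(fragment):
    return [(kind(t), t) for t in TOKEN.findall(fragment)]


def parser_needed(old_fragment, new_fragment):
    """True if the edit altered the sequence of token kinds, not just a token's value."""
    old_kinds = [k for k, _ in scan(old_fragment)]
    new_kinds = [k for k, _ in scan(new_fragment)]
    return old_kinds != new_kinds


print(parser_needed("X := 10", "X := 12"))    # value of an integer constant changed
print(parser_needed("X := 10", "X := 1.5"))   # integer constant became a real constant
```

In the first case only the scanner's output changes; in the second the kind of a token changes, so the parser must look at the fragment again.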

Any single change to the source code is bounded by a single fragment, not by 
the entire text, so the parser can confine itself to that fragment. Each fragment is 
edited separately, and has its own cursor. Magpie uses a textual editor. This 
precludes static semantic checking beyond the first syntax error within each 
fragment. The syntactic structure of each fragment is maintained as a sequence of 
partial parse trees. 

Recompilation is performed on a procedure-by-procedure basis, and is triggered
when a cursor leaves a fragment. Recompilation of a procedure is performed in
the background when the processor is not busy providing the programmer with
interactive response. 3 If execution commences before all of the compilation has
been finished then Magpie executes the existing code, pausing to generate code for
uncompiled procedures that are invoked during program execution.

Magpie uses Pascal as a debugging language. The programmer is able to invoke 
code in a given activation record, and to define demons (procedures that can be 
set up so that they are invoked whenever reference is made to a specified 
identifier). These demons can be disabled, although the "hook" into the compiled 
code remains. 

3.3.6. PSG - 1986 

The PSG programming system generator is described in [Bahlke 86]. It produces 
programming environments for a language given a definition of the language 
specified using an attribute grammar (see §3.4). 

The language definition is divided into three parts: 

• syntax 

• context conditions (scope and visibility rules, data attribute grammar, 
basic context relations) 

• dynamic semantics (domain definitions, auxiliary functions, meaning of 
executable parts of program, meaning functions). 

The syntax of the language is mandatory. If the context conditions are not
specified then the editor which is generated will be context-free. If the dynamic
semantics are not defined then the environment which is generated will have no
means of compiling programs written in that language.

The editor that is generated allows both structure editing and text editing. 
Where structure editing is used, the programmer is only given menu options which 
are syntactically and semantically valid. Hence the editor can guarantee the 
prevention of syntax errors and semantic errors. When textual editing is used, 
such errors will be recognized immediately and flagged, but not prevented. 

Programs are interpreted using the dynamic semantics information provided.
Incomplete programs can be interpreted until an attempt is made to interpret a
syntactically incomplete structure. The PSG system has been used to produce
environments for Pascal, Algol-60, Modula-2 and for its own formal language
definition language.

3 During the programmer's "think time" (sic) [Delisle 84].

3.4. Attribute Grammars and Environment Generators 

An attribute grammar 4 is a context-free grammar which has been augmented with 
information which specifies context-dependent aspects of the language. Trees 
generated from attribute grammars are called attributed structure trees. Each node 
of a structure tree has an associated attribute which describes properties of that 
node. 

Attribute grammars have been used in parser generating systems 5 and to generate 
SDEs. 6 As explained in §3.3.6, the PSG system can generate an entire 
programming environment for a language specified using an attribute grammar. 
However, there are several drawbacks associated with using attribute grammars in 
generator systems. 

Specifying a language using an attribute grammar requires that a substantial
number of functions be specifically designed for that specification. These functions 
provide the language's semantics, and the attribute grammar provides the 
dependency information used when finding the smallest recompilable unit. This 
dichotomy between semantics and dependency information adds to the complexity 
of a language specification. Language specification in PECAN (see §4.1.2) uses a 
specification language which provides dependency information and (almost) all the 
semantic information without recourse to additional functions. 

If a language specification is based upon an attribute grammar, the symbol table 
is usually represented by a set of state variables at each node of the structure tree. 
This has the inherent disadvantage that a large part of the program has to be 
recompiled whenever a change is made to a declaration. PECAN avoids this 
problem by determining exactly what references are affected by a change to a 
declaration, and processing only those references. 



4 For a comprehensive discussion of attribute grammars see Chapter 8 of [Waite 84].
5 e.g. GAG [Kastens 82].
6 e.g. (as already mentioned) the Cornell Synthesizer Generator [Reps 84].






Chapter 4 
The PECAN Programming Environment Generator 

4.1. Introduction 

The PECAN programming environment generator was developed at Brown 
University, Providence, U.S.A., under the direction of Steven Reiss. It is a large 
collection of large modules written in the C programming language and executable 
under the UNIX operating system. PECAN was initially designed to run on 
Apollo workstations, but has been adapted for use on Sun workstations. 1 

4.1.1. Documentation 

The PECAN system is very poorly documented. Although a user guide 
exists [Barlow 86a], there is little information available about the internal workings 
and structure of PECAN. Apart from a few papers on PECAN's component 
modules, the main sources of information are [Reiss 83, Reiss 84a, Reiss 84b]. 

Various aspects of the system are discussed in [Barlow 86b, Leung 86, Nearhos 
86, Purdue 86]. This relative dearth of information about the PECAN system 
leaves anyone interested in its workings with no choice but to examine the code. 
Unfortunately, the internal documentation is terse, bordering on the Trappist. 

4.1.2. Language Specification 

PECAN is a programming environment generator. A language's syntax and 
semantics are specified in PECAN's own high-level specification language. 2, 3
PECAN produces language-specific code from the specifications, which is merged 
with existing language-independent modules to form code which provides the 
programming environment. 



1 The project that is the subject of this thesis was developed using PECAN on a Sun-2 workstation at
the Computer Science Department, Australian National University.

2 PECAN does not use attribute grammars to specify languages for the reasons given in §3.4.

3 The specification of the Pascal WHILE statement is given in Figure 5-2.



The specification of a language is broken into four parts: 

• an abstract syntax of the language and the semantics of each construct 
in the language; 

• the properties of its symbols;

• a definition of the types allowed in the language, and details of type 
coercions for resolving expressions; and 

• details of how to build and resolve expressions. 

Theoretically, PECAN can generate an environment for any language that is 
algorithmic, block-structured and makes no explicit use of parallel processing. 
However, an extended version of Pascal (based on [Jensen 78]) is the only 
sophisticated programming language for which a reasonable environment has been 
generated. An environment for the mini-language Core (as defined in [Ledgard 
81]) has been generated, but the language Modula-2 [Wirth 83] proved too 
complicated for one honours student in 1986 [Leung 86]. The specification for 
Pascal is some 4000 lines, and a language as simple as Core required about 1200 
lines to be specified for PECAN. It can be seen that the specification of a 
language for PECAN is a complicated task. 

So, although PECAN is an environment generator, the only practical and useful 
environment which has been generated is that for Pascal. Future references to 
"PECAN" in this thesis will be references to the environment generated by the 
PECAN programming environment generator for the language Pascal. 
4.1.3. Views

PECAN makes good use of the graphical capacity of the Apollo and Sun 
workstations, providing the programmer with many views of the program under 
development; multiple views of the shared data structures of PECAN's various 
component modules. These views can be divided into five categories: 4 

• Program Views 

o Syntax-Directed Editor (SDE module - see §4.1.3.1) 
o Nassi-Schneiderman View (NASSI module) 
o Declaration View (DECL module) 
o Box Editor 
o Rothon Editor 

• Semantic Views (static semantic meaning) 

o Symbol Table View (SYMMOD module) 

o Data Type View (TYPE module) 

o Expression View (EXPR module) 

o Flow View (FLOW module - see §4.1.3.2) 

• Execution Views (dynamic semantic behaviour) 

o Interpreter View (PALM module) 
o Stack View (STACK module) 

• System Views 

o Transcript View (CMD module) 

• Miscellaneous Views 

o Draw Window 
o Clock Window 
o Button Window 
o Pics Window



4 Roughly corresponding to the division in [Reiss 84b].

All views provide up-to-date information on the state of the program or of its
execution. When changes are made in one view, that change is reflected 
immediately in all other appropriate views. For example, if a change is made in 
the SDE then that change is immediately reflected in the other program views. 
The various semantic views will reflect the change if it is relevant (e.g. if a change 
is made to a statement, that change is reflected in the flow view; if a change is 
made to an expression, that change is reflected in the expression view). 

An example PECAN screen is given in Figure 4-1. The screen shows several 
views of a program 5 which was in the process of calculating the value of 7!, before 
execution was halted. The views shown are (clockwise from the top left) the 
syntax-directed editor, the symbol table view, the clock window, the flow view, the 
stack view, the expression view, the transcript view, and the interpreter view. 

4.1.3.1. The Syntax-Directed Editor 

Program views provide the programmer with a visual representation of the 
abstract syntax tree (discussed in §4.2.2). The SDE allows both structural and 
textual cursor movement. Furthermore, the programmer may move the cursor 
directly to any part of the program using the pointing device. The programmer 
may use templates to build a program using menus to choose keywords and 
constructs. Alternatively, text may be entered and will be parsed (one line at a 
time). 6 All errors are flagged when detected. 

4.1.3.2. The Flow View 

The flow view represents the program in flow chart form. Flow charts are
constructed using a differently-shaped box to represent each of the following 
structures: the start; a variable declaration; a statement; a condition; an entry or 
exit point into a procedure or function; a junction of paths; and the end. 

The flow view's cursor responds to changes in other views, and if a node in the 
flow graph is chosen (i.e. pointed to) then other program views will reflect the 
change. This is the extent of interactivity allowed in the flow view. 



5 The test program test^.p (see §C3).

6 PECAN uses a parser based upon Earley's parsing algorithm. A detailed description of Earley's
algorithm is given in Appendix D.

Figure 4-1: PECAN Views

4.2. Internal Structure



4.2.1. Modules 



PECAN has a hierarchical module structure. This reflects the fact that PECAN 
was developed to work in an existing environment: the Brown Workstation 
Environment [Bazik 85]. The hierarchy of modules is shown in Figure 4-2. 



Program Views    Semantic Views    Execution Views
SDE  NASSI  DECL  SYMMOD  TYPE  EXPR  FLOW  STACK
View Support Environment    CMD    PARSE    PALM
Incremental Compiler    SEMCOM    SYMBOLS  EXPRS  TYPES  FLOWS
System Support Environment    PLUM    ASPEN    ACER
Brown Workstation Environment    ASH    MAPLE    SGP    VT    WILLOW
UNIX

Figure 4-2: Hierarchy of Modules in PECAN 7

7 Adapted from a figure in [Nearhos 86].

Several of the modules provide an abstract data type (with its own data
structure and operations) to the other modules. The module with which this thesis 
is primarily concerned is the SEMCOM module. The operation of SEMCOM is 
discussed in detail in Chapter 5. 

4.2.2. The Abstract Syntax Tree 

The main data structure which is used by all modules is the Abstract Syntax 
Tree (or AST). The AST is supported by the ASPEN module [Molinari 86]. As 
well as maintaining information about the structure of the program, the AST 
provides links to data structures used by other modules. Thus, the AST is the 
central data structure; access to all other data structures can be gained (perhaps 
indirectly) through the AST. 

4.2.3. Events 

In order for PECAN to present the programmer with an integrated environment, 
it is essential that the various modules have a means of communicating with each
other. For example, a change made to the program in the SDE may have effects 
upon all other views. It is clearly undesirable that any one module should have to 
explicitly invoke functions in other modules in order to propagate a change 
throughout the system. As well as being cumbersome to code, such an approach 
makes future expansion of the system very complicated. PECAN solves the 
problem of module communication by use of events. 

An event is effectively an announcement by one module, to any other module 
that might be interested, that some specified happening has occurred. Events are 
broadcast by the PLUM module [Molinari 85]. 8 

The event structure is set up in the following manner. When PECAN is first 
invoked, the main program calls the initialization functions for each module. Each 
module's initialization function registers (with PLUM) the events in which the 
module has an interest. This expression of interest is made using the
PLUMaccept_event function. PLUMaccept_event takes two arguments: a function
in the interested module, and the name of the relevant event. Any number of 
modules may register an interest in a given event. 



8 Note that although events are broadcast, execution is sequential; concurrent execution is not supported.

The PLUM module maintains a list of functions registered for each event. When
a module wishes to trigger an event, the PLUMevent function is used. PLUM 
invokes, in turn, each of the functions linked to that event. Parameters may be 
passed to the PLUMevent function. These parameters are passed to the interested 
functions when an event is propagated throughout the system. 
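
The registration and broadcast scheme described above can be sketched as follows. This is an illustration only: PLUM itself is written in C, and the class Plum, the event name NODE_CHANGED and the handlers are hypothetical stand-ins modelled on the description of PLUMaccept_event and PLUMevent, not on PLUM's actual code.

```python
from collections import defaultdict


class Plum:
    """Hypothetical stand-in for the PLUM event broadcaster."""

    def __init__(self):
        # One list of registered functions per event name.
        self._handlers = defaultdict(list)

    def accept_event(self, handler, event_name):
        # Modelled on PLUMaccept_event(function, event name).
        self._handlers[event_name].append(handler)

    def event(self, event_name, *params):
        # Modelled on PLUMevent: registered functions are invoked in turn
        # (sequentially), each receiving the parameters that were passed.
        for handler in list(self._handlers[event_name]):
            handler(*params)


log = []
plum = Plum()
# Two "modules" register an interest in the same (made-up) event.
plum.accept_event(lambda node: log.append(("FLOW", node)), "NODE_CHANGED")
plum.accept_event(lambda node: log.append(("NASSI", node)), "NODE_CHANGED")
# A third module announces a change without naming the interested modules.
plum.event("NODE_CHANGED", "while_statement")
```

The module that triggers the event needs no knowledge of which modules are listening, which is the property that makes later expansion of the system straightforward.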
Chapter 5
Incremental Compilation in PECAN 

5.1. Semantic Specification Statements 

The PECAN approach to incremental compilation is described (somewhat
inaccurately 1 ) in [Reiss 84a]. The SEMCOM module handles incremental
compilation in PECAN. To achieve this, SEMCOM maintains its own
language-independent representation of the semantic meaning of the AST - a list of
statements in a simple semantic language. These statements are referred to as
semantic specification statements. A brief description of the meaning of each of
these statements is given in Figure 5-1.

These statements can be divided into two categories: action statements and
control statements. When they are executed, action statements build the
underlying representation of the program. This underlying representation forms the
data structure used by the flow view to display the program in flow graph form.
This flow graph representation is directly interpreted when the program is run.
Control statements specify the order in which the action statements are executed.
The language uses a stack and a small set of variables called current items. The
current items are:

• the current scope; 

• the current referenced object; 

• the current flow graph node; 

• the current type;

• the current expression; 

• the current auxiliary scope;

• the last type built; and 

• the current mode. 



1 See second footnote on page 41.
DO       Visit a specified sub-tree.
FOR      Visit each of the children of a list-type node.
START    Create an INITIAL scope (marks the beginning of the tree walk).
BEGIN    Create a new scope.
END      Close the current scope, and return to the parent scope.
FIND     Find the symbol table name associated with the specified string
         or token.
LOOK     Partially resolve a name given specified restrictions.
USE      Resolve a name to a single object.
BUILD    Create a new object of a given type.
DEFINE   Take a newly created object and associate it with the current
         name.
SET      Set the current symbol.
GET      Access the current symbol.
VALUE    Determine the value of a constant given its textual representation.
MODE     Set flags that affect the current symbol's storage class, and the
         type of parameter that it may represent (inter alia).
PUSH     Push current symbol onto the stack.
POP      Pop current symbol off the stack.
EXPR     Build an expression from the top elements of the stack (using the
         current symbol as an operator, with a specified number of
         operands).
FLOW     Attach a new node to the flow graph representation.
TYPE     Build a data type.
CLEAR    Initialize the current items.

Figure 5-1: Semantic Specification Statements 2

2 Adapted from [Reiss 84a] and [Molinari 87a].

Semantic specification statements make use of the current items in order to reflect
the semantics of each construct in the programming language. Information is 
passed between semantic statements via the current items. The main advantage of 
this approach is that it becomes possible to extract dependency information from 
the specification of each construct, in order to determine the smallest recompilable 
unit. 

5.2. Specifying a Construct 

The sequence of semantic specification statements associated with each construct 
in the programming language forms part of the language specification (discussed in 
§4.1.2). The specification of the Pascal WHILE statement is given in Figure 5-2. 
This specification can be thought of as a set of instructions to PECAN as to how 
to "compile" a WHILE statement. 



STATEMENT ::= while_statement ;

while_statement => IF_EXPRESSION STATEMENT ::
    SOURCE: "WHILE @1 DO@+@R@c@n@2@-"
    COMMENT
    SYNONYM: "While"
    SEMANTICS: { CLEAR;
                 BEGIN loop;
                 DEFINE NAME=operator,EXIT,CLASS=label;
                 DEFINE NAME=operator,NEXT,CLASS=label;
                 USE NAME=operator,NEXT,CURRENT=ONLY;
                 FLOW LABEL=1,LABEL=REF;
                 DO @1;
                 FLOW NOTTEST.2;
                 DO @2;
                 FLOW GOTO=1;
                 USE NAME=operator,EXIT,CURRENT=ONLY;
                 FLOW LABEL=2,LABEL=REF;
                 END;
               }
    SEEDY: "WHILE @~ @n @1 @' WBLOCK @~ @n @2 @' @n WEND"
    ROTHON: LOOP @1 : @2 : ;
    NS: LOOP @1 @2 NONE;



Figure 5-2: Specification of Pascal WHILE Statement 3 



The string labelled SOURCE is used by the parser, and by the SDE for 
formatting the construct. COMMENT indicates that a comment may be attached 
to the WHILE statement. The SYNONYM is the name of the construct for use 
by the SDE in creating menus for template selection. 



3 This specification of the WHILE statement is taken from the specification used to generate a PECAN
environment for Pascal at the Australian National University. It differs slightly from the specification
given in [Reiss 84a].
ROTHON and NS define the representation of the WHILE statement for the
Rothon editor and the Nassi-Schneiderman view respectively. SEEDY defines the
representation for an apparently unimplemented view.

The statements between the curly brackets labelled SEMANTICS are the semantic
specification statements for the WHILE statement. The CLEAR statement
initializes the current items, which states that the WHILE statement is completely
independent of preceding Pascal statements. The BEGIN statement starts a scope
of type loop. The two DEFINE statements define an EXIT label and a NEXT
label in the operator auxiliary table. The first USE statement extracts the NEXT
label for use in the subsequent FLOW statement. That FLOW statement defines two
labels in the flow graph: NEXT and a temporary label 1. The first DO statement
causes the semantic specification statements associated with the IF_EXPRESSION
sub-tree to be processed next. The second FLOW statement causes a jump to
temporary label 2 if evaluating the IF_EXPRESSION returns false. The second DO
statement processes the body of the WHILE statement, and the third FLOW
statement causes an unconditional branch back to temporary label 1. The final USE
statement and FLOW statement access, and attach to the flow graph, the
EXIT label and temporary label 2. The END statement ends the loop scope which
was begun with the BEGIN statement.
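
The order in which the FLOW and DO statements lay out the loop can be made concrete with a small sketch. It is not PECAN code: the tuple encoding of flow nodes, the function names while_flow and run, and the toy interpreter are all assumptions made for illustration, and scopes and the NEXT and EXIT symbol table entries are ignored.

```python
def while_flow(cond_nodes, body_nodes):
    """Lay out flow nodes for a WHILE loop in the order described above."""
    flow = [("LABEL", 1)]        # first FLOW: temporary label 1 (top of loop)
    flow.extend(cond_nodes)      # first DO: the condition sub-tree
    flow.append(("NOTTEST", 2))  # second FLOW: jump to label 2 if test is false
    flow.extend(body_nodes)      # second DO: the loop body
    flow.append(("GOTO", 1))     # third FLOW: branch back to label 1
    flow.append(("LABEL", 2))    # last FLOW: temporary label 2 (loop exit)
    return flow


def run(flow, env, max_steps=10_000):
    """Toy interpreter for the node list (hypothetical node kinds)."""
    labels = {n[1]: i for i, n in enumerate(flow) if n[0] == "LABEL"}
    pc, test = 0, None
    for _ in range(max_steps):
        if pc >= len(flow):
            return env
        op, arg = flow[pc]
        pc += 1
        if op == "EVAL":          # evaluate a condition, remember the result
            test = arg(env)
        elif op == "ACT":         # perform a statement
            arg(env)
        elif op == "NOTTEST" and not test:
            pc = labels[arg]
        elif op == "GOTO":
            pc = labels[arg]
    raise RuntimeError("step limit reached")


def body_step(env):
    env["out"].append(env["i"])
    env["i"] += 1


cond = [("EVAL", lambda env: env["i"] < 3)]
body = [("ACT", body_step)]
result = run(while_flow(cond, body), {"i": 0, "out": []})
```

The step limit exists only so that a loop such as the one in Figure 5-3, which never terminates, would not hang the sketch.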
5.3. Data Structure

5.3.1. SEMCOM_STMTs and the Abstract Syntax Tree

SEMCOM stores its semantic specification statements as a doubly-linked list of
record structures called SEMCOM_STMTs. 4 Each of these SEMCOM_STMTs
contains:

• pointers forwards and backwards to other SEMCOM_STMTs (used to 
maintain the doubly-linked list); 

• details of the type of semantic specification statement being represented; 

• a pointer into the AST (for arguments to the semantic specification 
statement); and 

• the values of the current items. 5 

The semantics of the entire program can be represented by a list of 
SEMCOM_STMTs. Each node of the AST has a pair of pointers which mark the 
beginning and the end of the list of SEMCOM_STMTs which give the semantics of 
the construct at that particular node. This is illustrated in Figure 5-4. 
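
The relationship between the list and the tree can be sketched as below. The real records are C structures inside SEMCOM; the class and field names used here (Stmt, AstNode, first, last) are hypothetical, and only the shape of the data structure is taken from the description above.

```python
class Stmt:
    """One list element; the fields mirror the four items listed above."""

    def __init__(self, kind, ast_node=None, current_items=None):
        self.prev = None                 # backward link
        self.next = None                 # forward link
        self.kind = kind                 # type of statement represented
        self.ast_node = ast_node         # pointer into the AST (arguments)
        self.current_items = dict(current_items or {})


class AstNode:
    def __init__(self, name):
        self.name = name
        self.first = None                # first Stmt for this construct
        self.last = None                 # last Stmt for this construct


def append(tail, stmt):
    """Link stmt after tail and return the new tail."""
    if tail is not None:
        tail.next = stmt
        stmt.prev = tail
    return stmt


def stmts_of(node):
    """Collect the statement kinds lying between node.first and node.last."""
    out, s = [], node.first
    while s is not None:
        out.append(s.kind)
        if s is node.last:
            break
        s = s.next
    return out


while_node, cond_node = AstNode("WHILE"), AstNode("BOOLEAN_EXPRESSION")
tail = None
for kind, owner in [("BEGIN", while_node), ("FIND", cond_node),
                    ("USE", cond_node), ("FLOW", while_node),
                    ("END", while_node)]:
    tail = append(tail, Stmt(kind, owner))
    if owner.first is None:
        owner.first = tail
    owner.last = tail
```

Because the sub-list for a child construct is nested inside the sub-list for its parent, the pair of pointers held by each node is enough to locate the semantics of any sub-tree within the single main list.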

5.3.2. SEMCOM_STMTs and the Flow Graph Representation

Consider the Pascal program listed in Figure 5-3. Using the specification of the
WHILE statement (given in Figure 5-2), PECAN parses the WHILE statement into
a tree (shown in Figure 5-5). The SEMCOM module produces a list of
SEMCOM_STMTs which give the semantics of that particular instance of the
WHILE statement. The list of SEMCOM_STMTs produced for this example
appears in Figure 5-6. The beginning and the end of each of the sub-lists of the
list are labelled with the name of the associated node of the tree. 7 When this list
of SEMCOM_STMTs is executed, the flow graph representation of the WHILE
statement is constructed. The flow graph representation for this WHILE statement
appears in Figure 5-7.



4 Note that the mapping from semantic specification statements to SEMCOM_STMTs is not quite
one-to-one. Each action statement in the semantic specification is mapped into one or more
SEMCOM_STMTs. Statements like USE and LOOK can imply several actions, and the interpretation of
statements like SET can depend upon their arguments.

5 One of the current items is the current flow node. It is through this pointer that the associated
(interpretable) flow graph representation of the program is accessed.

7 Part of this thesis project involved the development of a new PECAN view which displays the
SEMCOM_STMTs associated with the current node (as indicated by the cursor in the SDE or some other
program view). The list in Figure 5-6 was prepared using this semantic actions view. Details of this new
view are given in Appendix A. The form in which SEMCOM_STMTs are displayed is explained in §A.1.
PROGRAM interminable (input, output);

{ no declarations }

BEGIN { Program interminable }
    WHILE true DO
        WRITELN('loop');
END.

Figure 5-3: Small Pascal Program with an Example WHILE statement 6 



5.4. Execution and Unexecution 

When a sequence of SEMCOM_STMTs is executed, 8 a flow graph representation 
is constructed. This flow graph representation is interpreted in order to run the 
program. SEMCOM_STMTs can also be unexecuted. Unexecuting a sequence of 
SEMCOM_STMTs has the effect of removing, from the flow graph representation, 
those constructs which were created when that same sequence of SEMCOM_STMTs 
was executed. 9 

This symmetry of SEMCOM_STMTs - the fact that they can be both executed 
and unexecuted - is essential to PECAN's approach to incremental compilation. 
Ignoring (for the moment) the problems involved in finding the smallest 
recompilable unit, the process of incremental compilation can be thought of in the 
following manner. When a node is changed in the AST, the SEMCOM_STMTs 
associated with the old node are unexecuted. This has the effect of removing, 
from the flow graph, the code corresponding to the node as it was before 
alteration. Next, the SEMCOM_STMTs associated with the new AST node are 
executed. This inserts, into the flow graph, the code corresponding to the new 
node. The flow graph is now, as before, an interpretable representation of the 
program (as amended). 
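
The symmetry between execution and unexecution can be sketched with a deliberately simplified model in which each statement contributes exactly one entry to a flat list standing in for the flow graph. The function names and this one-to-one model are assumptions for illustration; real SEMCOM_STMTs also update symbol tables, scopes and the current items.

```python
def execute(stmts, graph, at):
    """Insert the flow nodes produced by stmts into graph, starting at 'at'."""
    for offset, label in enumerate(stmts):
        graph.insert(at + offset, label)


def unexecute(stmts, graph, at):
    """Remove, in reverse order, exactly what execute(stmts, graph, at) added."""
    for offset in reversed(range(len(stmts))):
        assert graph[at + offset] == stmts[offset]
        del graph[at + offset]


def replace_node(graph, at, old_stmts, new_stmts):
    """Recompile one changed node: unexecute the old list, execute the new."""
    unexecute(old_stmts, graph, at)
    execute(new_stmts, graph, at)


graph = ["start", "stop"]
old = ["write 'loop'", "writeln"]
new = ["write 'done'", "writeln"]
execute(old, graph, 1)
replace_node(graph, 1, old, new)
```

After the replacement the rest of the list is untouched, which is the sense in which the flow graph remains an interpretable representation of the amended program.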



6 Note that this program listing was formatted by PECAN, using the formatting information included in
the specification of Pascal.

8 The execution of SEMCOM_STMTs should not be confused with the execution of the program (i.e. the
interpretation of the flow graph representation).

9 The functions that perform execution and unexecution consult and update the values of the current
items, as discussed in §5.5.2.5.

Figure 5-4: Abstract Syntax Tree with Pointers into List of SEMCOM_STMTs

Figure 5-5: Parse Tree for Example WHILE Statement
WHILE (begin)
  (0x2484f8) : SAVE_CUR                  [0x2486c8]  @0x24a3e8
  (0x24dad0) : FLOW_ALLOC 519            [0x248ae0]  @0x24a3e8
  (0x24dab4) : CLEAR 521                 [0x0]       @0x24a3e8
  (0x24da98) : BEGIN 522                 [0x218964]  @0x24a3e8
  (0x24da60) : CHECK_LEX                 [0x0]       @0x24a3e8
  (0x24da7c) : FIND 524                  [0x21b3e4]  @0x24a3e8
  (0x24da44) : BUILD 527                 [0x24929c]  @0x24a3e8
  (0x24da28) : DEFINE 529                [0x21b3e4]  @0x24a3e8
  (0x24d9f0) : CHECK_LEX                 [0x0]       @0x24a3e8
  (0x24da0c) : FIND 531                  [0x249494]  @0x24a3e8
  (0x24dd90) : BUILD 534                 [0x249260]  @0x24a3e8
  (0x24dd74) : DEFINE 536                [0x249494]  @0x24a3e8
  (0x24dd3c) : CHECK_LEX                 [0x0]       @0x24a3e8
  (0x24dd58) : FIND 538                  [0x24946c]  @0x24a3e8
  (0x24dd20) : R_CURONLY 541             [0x21b5f4]  @0x24a3e8
  (0x24dd04) : USE 542                   [0x24946c]  @0x24a3e8
  (0x24dce8) : FLOW_BLOCK 543            [0x248aac]  @0x24a3e8
  (0x24dccc) : FLOW_BLOCK 546            [0x248a78]  @0x24a3e8
  (0x24dcb0) : SYMBOL_SET_FLOW 549       [0x24946c]  @0x24a3e8
BOOLEAN_EXPRESSION (begin)
  (0x24df1c) : SAVE_CUR                  [0x2486a0]  @0x24a520
IDENTIFIER (begin)
  (0x24dbec) : SAVE_CUR                  [0x248678]  @0x24a554
  (0x24dc78) : CHECK_LEX                 [0x218d82]  @0x24a588
  (0x24dc94) : FIND 1604                 [0x249444]  @0x24a588
  (0x24dc5c) : R_CLASS 1843              [0x248dfc]  @0x24a554
  (0x24dc40) : R_CURNEST 1853            [0x21b5e4]  @0x24a554
  (0x24dc24) : USE 1854                  [0x249444]  @0x24a554
  (0x24dc08) : EXPR_REF 1855             [0x20cad0]  @0x24a554
IDENTIFIER (end)
  (0x24df70) : CHECK_LEX                 [0x0]       @0x24a520
  (0x24dbd0) : FIND 1327                 [0x24941c]  @0x24a520
  (0x24df54) : USE 1330                  [0x24941c]  @0x24a520
  (0x24df38) : EXPR_BUILD 1331           [0x20ca98]  @0x24a520
BOOLEAN_EXPRESSION (end)
  (0x24df00) : FLOW_BLOCK 552            [0x248a44]  @0x24a3e8
WRITELN (begin)
  (0x24e218) : SAVE_CUR                  [0x248650]  @0x24a41c
  (0x24dee4) : CLEAR 1123                [0x0]       @0x24a41c
  (0x24deac) : CHECK_LEX                 [0x0]       @0x24a41c
  (0x24dec8) : FIND 1124                 [0x2493f4]  @0x24a41c
  (0x24de90) : EXPR_BUILD 1127           [0x20ca60]  @0x24a41c
  (0x24de74) : FLOW_BLOCK 1129           [0x248a10]  @0x24a41c
OUT_EXPR_S (begin)
  (0x24e314) : NOP                       [0x0]       @0x24a450
OUTPUT_EXPRESSION (begin)
  (0x24df90) : SAVE_CUR                  [0x248628]  @0x24a484
  (0x24de58) : EXPR_BEGIN 1151           [0x20ca28]  @0x24a484
STRING (begin)
  (0x24e01c) : SAVE_CUR                  [0x248600]  @0x24a4ec
  (0x24de20) : CHECK_LEX                 [0x218d87]  @0x24a4ec
  (0x24de3c) : FIND 1520                 [0x2493cc]  @0x24a4ec
  (0x24de04) : R_TOPONLY 1523            [0x21b5d4]  @0x24a4ec
  (0x24dde8) : R_CLASS 1524              [0x248df0]  @0x24a4ec
  (0x24ddcc) : USE 1527                  [0x2493cc]  @0x24a4ec
  (0x24ddb0) : TYPE_BUILD_REF 1528       [0x249678]  @0x24a4ec
  (0x24e134) : TYPE_END_SET              [0x249678]  @0x24a4ec
  (0x24e150) : TYPE_END 1530             [0x0]       @0x24a4ec
  (0x24e0fc) : CHECK_LEX                 [0x218d87]  @0x24a4ec
  (0x24e118) : FIND 1531                 [0x2493a4]  @0x24a4ec
  (0x24e0e0) : BUILD 1534                [0x249b54]  @0x24a4ec
  (0x24e0c4) : DEFINE 1536               [0x2493a4]  @0x24a4ec
  (0x24e0a8) : EXPR_REF 1538             [0x20c9f0]  @0x24a4ec
  (0x24e08c) : MODE_SET 1539             [0x1201]    @0x24a4ec
  (0x24e070) : SYMBOL_SET_MODE 1541      [0x2493a4]  @0x24a4ec
  (0x24e054) : VALUE 1542                [0x2493a4]  @0x24a4ec
  (0x24e038) : SYMBOL_SET_TYPE_OF 1544   [0x249678]  @0x24a4ec
STRING (end)
  (0x24dfe4) : CHECK_LEX                 [0x0]       @0x24a484
  (0x24e000) : FIND 1156                 [0x249304]  @0x24a484
  (0x24dfc8) : EXPR_BUILD 1159           [0x20c910]  @0x24a484
  (0x24dfac) : FLOW_BLOCK 1161           [0x2489dc]  @0x24a484
OUTPUT_EXPRESSION (end)
  (0x24e330) : NOP                       [0x0]       @0x24a450
OUT_EXPR_S (end)
  (0x24e2dc) : CHECK_LEX                 [0x0]       @0x24a41c
  (0x24e2f8) : FIND 1134                 [0x2492dc]  @0x24a41c
  (0x24e2c0) : EXPR_BUILD 1137           [0x20ca28]  @0x24a41c
  (0x24e2a4) : FLOW_BLOCK 1139           [0x2489a8]  @0x24a41c
  (0x24e26c) : CHECK_LEX                 [0x0]       @0x24a41c
  (0x24e288) : FIND 1142                 [0x249e3c]  @0x24a41c
  (0x24e250) : EXPR_BUILD 1145           [0x20c8d8]  @0x24a41c
  (0x24e234) : FLOW_BLOCK 1147           [0x248974]  @0x24a41c
WRITELN (end)
  (0x24e1fc) : FLOW_BLOCK 557            [0x248940]  @0x24a3e8
  (0x24e1c4) : CHECK_LEX                 [0x0]       @0x24a3e8
  (0x24e1e0) : FIND 560                  [0x249e14]  @0x24a3e8
  (0x24e1a8) : R_CURONLY 563             [0x249c40]  @0x24a3e8
  (0x24e18c) : USE 564                   [0x249e14]  @0x24a3e8
  (0x24e170) : FLOW_BLOCK 565            [0x24890c]  @0x24a3e8
  (0x248568) : FLOW_BLOCK 568            [0x2488d8]  @0x24a3e8
  (0x24854c) : SYMBOL_SET_FLOW 571       [0x249e14]  @0x24a3e8
  (0x248530) : END 572                   [0x218984]  @0x24a3e8
  (0x248514) : FLOW_FREE 573             [0x2488a4]  @0x24a3e8
WHILE (end)

Figure 5-6: List of SEMCOM_STMTs for Example WHILE Statement
[Flow graph; node labels in order: set out file(), write('loop'), writeln, file end, EXIT, STOP]

Figure 5-7: Flow Graph Representation of Example WHILE Statement
5.5. Incremental Compilation in PECAN

The system as described thus far would be appropriate for an α-type incremental
compiler operating in the following manner. If the recompilable unit was taken to
be a Pascal procedure then, every time a node was changed in the AST, the
SEMCOM_STMTs associated with the procedure in which the change was made
could be unexecuted (effectively removing the interpretable code for that procedure),
and then the SEMCOM_STMTs representing the modified procedure could be executed
to restore the flow graph.

However, PECAN is a β-type incremental compiler; it determines the smallest
recompilable unit before incrementally compiling. The algorithm used by PECAN
is described in §5.5.1.

5.5.1. General Algorithm 

When a change is made to the AST, SEMCOM creates a list of
SEMCOM_STMTs (the new list) corresponding to the new node. The list of
SEMCOM_STMTs corresponding to the node as it was before the alteration is
referred to as the old list. The old list and the new list are compared and the
area of difference is established. The SEMCOM_STMTs preceding and following
the area of difference in both lists are disregarded, in order to avoid unnecessary
recompilation.
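
Establishing the area of difference amounts to trimming the common head and the common tail of the two lists. The sketch below shows that step in isolation; it compares statements as plain strings, whereas the criterion SEMCOM actually uses to decide that two SEMCOM_STMTs are the same is not modelled here, and the function name is hypothetical.

```python
def area_of_difference(old, new):
    """Return (start, old_end, new_end): old[start:old_end] and
    new[start:new_end] are the parts left after the common head and
    common tail of the two lists have been disregarded."""
    start = 0
    limit = min(len(old), len(new))
    while start < limit and old[start] == new[start]:
        start += 1
    old_end, new_end = len(old), len(new)
    while (old_end > start and new_end > start
           and old[old_end - 1] == new[new_end - 1]):
        old_end -= 1
        new_end -= 1
    return start, old_end, new_end


old_list = ["SAVE_CUR", "FIND x", "USE x", "EXPR_REF",
            "FIND y", "USE y", "EXPR_BUILD"]
new_list = ["SAVE_CUR", "FIND z", "USE z", "EXPR_REF",
            "FIND y", "USE y", "EXPR_BUILD"]
start, old_end, new_end = area_of_difference(old_list, new_list)
```

In this made-up example only the two statements that mention the replaced identifier fall inside the area of difference; everything before and after them is shared by both lists.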

It is not sufficient to simply unexecute the resulting old list then execute the 
corresponding new list. It may well be that the area of difference represents only 
part of a construct. Its semantic validity may depend upon SEMCOM_STMTs 
representing the rest of the construct. For example, consider the Pascal statement 

    IF x = y THEN
        <statement>
    ELSE
        <statement>;

If the identifiers x and y are declared as being of the same type then this will be 
a valid statement. If the identifier x is replaced by the identifier z then the 
validity of the condition depends upon the type of z. Clearly it is not enough to 
simply replace the flow graph code that determines the value of x with similar 
code for z. First, z must be checked for compatibility with y. 

SEMCOM extends the new list to include SEMCOM_STMTs until all of the local
effects of the change have been covered. The extended part of the new list is
unexecuted, back to the point where the lists differed. The old list is then
unexecuted, before the extended new list is executed. An update routine
propagates changes throughout the rest of the program.
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5.5.2. Implementation Details 

A detailed description of how SEMCOM implements this algorithm requires an 
understanding of the workings of some of the lower-level SEMCOM functions. 

The operation of the functions head_merge, tail_merge, extend, remove and insert
will be described by reference to the diagrammatic representation of the old and 
new lists which appears in Figure 5-8. The old list is that list between the oldp 
and oendp pointers. The new list is that list between the newp and nendp 
pointers. The SEMCOM_STMTs with shaded bodies are those that form the area 
of difference. 

5.5.2.1. head_merge 

Figure 5-8(a) shows the state of the lists of SEMCOM_STMTs before the
head_merge operation is performed. The old list is part of a longer list that
represents the whole program - the main list. The new list exists separately.
The head_merge operation moves the oldp and newp pointers down their respective
lists until the SEMCOM_STMTs that they refer to are different. As the pointers
are moved, the new list is merged into the old list, and the duplicate
SEMCOM_STMTs are removed from the old list. Figure 5-8(b) shows the state of
the lists after the head_merge.

5.5.2.2. tail_merge

The tail_merge function is complementary to head_merge. Figure 5-8(b) shows
the state of the lists before the tail_merge operation and Figure 5-8(c) shows the
state afterwards. The duplicated SEMCOM_STMTs in the new list have been
merged into the old list, and the corresponding old SEMCOM_STMTs have been
discarded.
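
Taken together, head_merge and tail_merge isolate the area of difference by discarding the statements that the two lists share at their heads and at their tails. The idea can be sketched in Python as follows. This is only an illustration under simplifying assumptions: it works on plain lists rather than on PECAN's pointer-linked lists, the function name area_of_difference is hypothetical, and the statement strings (including EQ) are invented for the example.

```python
def area_of_difference(old, new):
    """Return (head, tail): the number of leading and trailing statements
    that the old and new lists have in common."""
    limit = min(len(old), len(new))

    # head_merge analogue: advance while the leading statements match.
    head = 0
    while head < limit and old[head] == new[head]:
        head += 1

    # tail_merge analogue: retreat while the trailing statements match,
    # without crossing into the prefix that has already been consumed.
    tail = 0
    while tail < limit - head and old[-1 - tail] == new[-1 - tail]:
        tail += 1

    return head, tail


# Invented statement lists: x has been replaced by z in a comparison.
old = ["BEGIN", "PUSH x", "PUSH y", "EQ", "POP", "END"]
new = ["BEGIN", "PUSH z", "PUSH y", "EQ", "POP", "END"]

head, tail = area_of_difference(old, new)
old_diff = old[head:len(old) - tail]   # what remove would have to unexecute
new_diff = new[head:len(new) - tail]   # what insert would have to execute
```

Only the statements left in old_diff and new_diff need further attention; everything before and after them is shared and stays in the main list.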

5.5.2.3. extend 

The extend function moves the nendp pointer (effectively extending the new list) 
until it includes all of the SEMCOM_STMTs required to ensure that all of the 
local effects of the change are completed. As has been explained, the meaning of 
each construct in the language is given by semantic specification statements in
terms of the current items. So the local effects of a change to the program will 
be reflected in those current items. 



SEMCOM_STMTs removed from the old list are shown in Figure 5-8(b) with no pointers pointing to
them. In fact they are removed, one at a time, by head_merge yet they do not disappear from the
diagrammatic representation until Figure 5-8(c). The discarded SEMCOM_STMTs appear in Figure 5-8(b)
in order to make the operation of head_merge clear. The same is true of tail_merge, where the
SEMCOM_STMTs that are removed do not disappear until Figure 5-8(d).
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Figure 5-8: Effect of head_merge, tail_merge and extend 

upon the old and new lists 
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Some of the semantic specification statements have a corresponding statement 
which must appear in order for the list of statements to provide a valid 
specification. For example a BEGIN statement (which marks the beginning of a 
new scope) must have an associated END; a PUSH statement must have an 
associated POP. These statements, which must follow certain other statements, 
will be referred to as end bracket statements. There are four end bracket 
statements (END, POP, TYPE and FLOW) which may be required by the 
occurrence of various start bracket statements. 

The extend function proceeds as follows. First the old list is scanned, in order
to count the number of end bracket statements with no matching start bracket
statements in the old list. The new list is then scanned, and extended (if
necessary) until

• it contains an unmatched end bracket statement corresponding to each
such unmatched end bracket statement found in the old list; and

• each start bracket statement in the new list has a matching end
bracket statement.

Figure 5-8(c) shows the state of the lists before the extend operation, and Figure
5-8(d) shows their state afterwards. Because the new list has been merged into
the old list (by head_merge and tail_merge), the only limit on how far extend can
move the nendp pointer is the end of the complete list of SEMCOM_STMTs (i.e.
the end of the program).

The extend function also performs the unexecution of the extended part of the
new list (marked A in Figure 5-8(d)).
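
The bracket-matching rule that governs extend can be sketched in Python. The sketch rests on several assumptions that are not taken from PECAN: statements are plain strings, only the BEGIN/END and PUSH/POP pairs are modelled (the start brackets for TYPE and FLOW are left out), the new list is the slice main[start:end] of the main list, and the helper names scan and extend_end are hypothetical. The unexecution that extend also performs is not modelled.

```python
# Start bracket -> end bracket (only the two pairs named in the text).
CLOSER = {"BEGIN": "END", "PUSH": "POP"}
ENDS = set(CLOSER.values())


def scan(stmts):
    """Return (unmatched_ends, open_starts) for a run of statements."""
    pending = []          # end brackets still owed by start brackets seen
    unmatched_ends = 0
    for stmt in stmts:
        if stmt in CLOSER:
            pending.append(CLOSER[stmt])
        elif stmt in ENDS:
            if pending and pending[-1] == stmt:
                pending.pop()
            else:
                unmatched_ends += 1
    return unmatched_ends, len(pending)


def extend_end(main, start, end, old):
    """Move end forward until the new list main[start:end] has as many
    unmatched end brackets as the old list and no open start bracket."""
    needed, _ = scan(old)
    while end < len(main):
        unmatched, still_open = scan(main[start:end])
        if unmatched >= needed and still_open == 0:
            break
        end += 1
    return end


main = ["BEGIN", "PUSH", "x", "PUSH", "y", "POP", "POP", "END"]
# The old area of difference ended with a POP whose PUSH preceded it.
new_end = extend_end(main, start=3, end=5, old=["POP"])
```

Rescanning the slice on every step is wasteful but keeps the two stopping conditions visible; an incremental count would serve equally well.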

5.5.2.4. remove and insert 

The remove function unexecutes (in reverse order) each of the SEMCOM_STMTs 
in the old list (marked B in Figure 5-8(d)). Each SEMCOM_STMT is removed 
from the list after unexecution. 

The insert function executes (in order) each of the SEMCOM_STMTs in the 
extended new list (marked C in Figure 5-8(d)). 
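
The order of operations on the regions marked A, B and C can be sketched as follows. This is an illustrative Python outline, not PECAN's code: the function recompile and its callback parameters are hypothetical, and the assumption that the extension is unexecuted last statement first is the author of this sketch's reading of "unexecuted back to the point where the lists differed".

```python
def recompile(old_diff, new_diff, extension, execute, unexecute):
    """Apply the three phases described above to lists of statements."""
    # extend: unexecute the extended part of the new list (A),
    # assumed here to proceed from the last statement to the first.
    for stmt in reversed(extension):
        unexecute(stmt)

    # remove: unexecute the old list (B) in reverse order.
    for stmt in reversed(old_diff):
        unexecute(stmt)

    # insert: execute the extended new list (C) in order.
    for stmt in new_diff + extension:
        execute(stmt)


log = []
recompile(
    old_diff=["o1", "o2"],
    new_diff=["n1"],
    extension=["e1", "e2"],
    execute=lambda s: log.append(("execute", s)),
    unexecute=lambda s: log.append(("unexecute", s)),
)
```

The log makes the point of the extra unexecution step visible: e1 and e2 are unexecuted before insert executes them again.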



Not all of the start bracket statements for TYPE and FLOW have been identified. 

Inexplicably, Reiss makes no mention of this step in his description of PECAN's incremental
compilation [Reiss 84a]. If this step is not taken, the extended part of the new list will soon be executed
(by insert) without first having been unexecuted.
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5.5.2.5. The Current Items and Execution and Unexecution 

The functions _SEMCOM_execute and _SEMCOM_unexecute perform execution
and unexecution respectively. Before a SEMCOM_STMT can be executed or
unexecuted it has to be put into context; the values of the current items must be
established. Before the insert function calls the _SEMCOM_execute function for
the first time, it calls the _SEMCOM_set_current function to set the current items
to the values that they should hold before the first SEMCOM_STMT in the new
list. _SEMCOM_set_current moves backwards through the list of
SEMCOM_STMTs preceding the new list, retrieving the values that were most
recently assigned to each of the current items. Once values for all of the current
items have been retrieved, execution can commence. Each time a
SEMCOM_STMT is executed, the current items are updated accordingly.

Unexecution is handled slightly differently. Every time the _SEMCOM_unexecute
function is called (by extend or remove) in order to unexecute a single 
SEMCOM_STMT, the values of the current items are determined. However, the 
_SEMCOM_unexecute function only determines the values of those current items 
which are relied upon in the unexecution of the SEMCOM_STMT in question. 
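
The backward search for the current items can be sketched in Python. The representation is an assumption made for illustration: each statement is modelled as a dictionary of the current items it assigns, the item names (scope, type, flow) and their values are invented, and set_current is a hypothetical stand-in rather than the real _SEMCOM_set_current.

```python
def set_current(stmts, start, items):
    """Walk backwards from the statement preceding index start and collect
    the most recently assigned value of each requested current item."""
    current = {}
    for index in range(start - 1, -1, -1):
        for item, value in stmts[index].items():
            # The first value met while walking backwards is the latest one.
            if item in items and item not in current:
                current[item] = value
        if len(current) == len(items):
            break   # every requested item has been found
    return current


stmts = [
    {"scope": "program"},
    {"type": "integer"},
    {"scope": "proc p"},
    {"flow": "node7"},
    {"type": "real"},      # index 4: first statement of the new list
]

before_insert = set_current(stmts, 4, {"scope", "type", "flow"})
for_unexecute = set_current(stmts, 4, {"flow"})
```

Passing a smaller set of items, as in the second call, corresponds to the behaviour described for unexecution, where only the items actually relied upon are looked up.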

5.5.2.6. Updating the Semantics 

SEMCOM has four semantic support modules: the symbol table support module, 
the type support module, the expression support module, and the flow graph 
support module. When a SEMCOM_STMT is executed or unexecuted, two stages 
of processing are triggered: 

• the flow graph representation is modified (as explained in §5.4); and 

• information is passed to the relevant support module for processing after 
the execution and unexecution of all the SEMCOM_STMTs is completed. 

In the second case, information is queued to a support module which adds that 
information to a list of operations it must perform when the execution and 
unexecution is finished. 

When a definition of a name is created, modified or removed, all of the references 
to that name are queued with the symbol table support module for later checking. 
When a type reference cannot be immediately resolved (i.e. it relies upon a name 
in the symbol table) then that type is queued with the type support module. 
When an expression is modified it is queued with the expression support module 
for later resolution. When a flow graph operation is required, but cannot be 
performed in the first phase (i.e. it relies upon a name in the symbol table), that 
operation is queued with the flow graph support module. 
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When all of the execution and unexecution has been performed, the 
SEMCOM_update function is invoked. That function calls each of the support
modules in turn, requesting that they process the requests stored in their respective 
queues. Each call causes the support module in question to continue resolving 
items from its list until the list is empty. The dependencies between the modules 
are such that running down the list of one module can result in other requests 
being queued in any other support module except the symbol table support module. 
For that reason, the symbol table support module is forced to update first, then 
the other three support modules are called repeatedly until all of the lists are 
empty, at which point all of the effects of the original change have been 
propagated throughout the program. 
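
The update loop just described can be sketched in Python. The sketch assumes that each support module's pending work is a simple queue and that resolving one request yields a list of follow-up requests for other modules; the names drain, update, resolve and FOLLOW_UPS, and the requests themselves, are hypothetical.

```python
from collections import deque

OTHERS = ("type", "expression", "flow")


def drain(name, queues, resolve, log):
    """Process one module's queue until it is empty."""
    queue = queues[name]
    while queue:
        request = queue.popleft()
        log.append((name, request))
        # Resolving a request may queue work for other modules.
        for target, follow_up in resolve(name, request):
            queues[target].append(follow_up)


def update(queues, resolve):
    log = []
    # Nothing ever queues work for the symbol table, so it goes first.
    drain("symbol", queues, resolve, log)
    # The other three are revisited until all of their queues are empty.
    while any(queues[name] for name in OTHERS):
        for name in OTHERS:
            drain(name, queues, resolve, log)
    return log


# Invented dependencies between requests.
FOLLOW_UPS = {
    ("symbol", "x"): [("type", "T"), ("expression", "x = y")],
    ("expression", "x = y"): [("flow", "branch"), ("type", "T2")],
}


def resolve(module, request):
    return FOLLOW_UPS.get((module, request), [])


queues = {name: deque() for name in ("symbol",) + OTHERS}
queues["symbol"].append("x")
log = update(queues, resolve)
```

In this run the type module is visited twice, because resolving the expression queues a further type request after the type queue has already been emptied once.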

5.5.2.7. Driving Routines - The Outer Level of SEMCOM 

The functions described above (§5.5.2.1 to §5.5.2.4) are invoked by the externally 
visible (outer level) SEMCOM routines. 

When SEMCOM is initialized it registers its interest in an event called 
ASPEN_$NODE_CHANGE. This event is triggered by the ASPEN module when a 
node in the AST is changed or deleted. The ASPEN_$NODE_CHANGE event 
passes, as a parameter, a pointer to the modified node in the AST. The event 
causes a call to the sem_event_node function, which determines whether the
node has been modified or deleted and calls _SEMCOM_replace_list or
_SEMCOM_remove_list accordingly.

_SEMCOM_replace_list uses the ASPEN_inq_semantics function to find the head
and tail of the list of SEMCOM_STMTs associated with the new node. Although 
the node has been changed, its associated SEMCOM_STMTs are still those of the 
old node (i.e. the old list). 

A new list of SEMCOM_STMTs is generated, representing the semantics of the
new node. The pointers to the head and tail of this list (the new list) are stored
in the AST, overwriting the AST's pointers to the old list. After the head_merge
and tail_merge functions merge the new list into the main list, the main list of
SEMCOM_STMTs accurately reflects the semantics of the program represented by
the AST.

Incremental compilation may now begin. The values of the oldp, oendp, newp
and nendp pointers are known. These values are used to call head_merge,
tail_merge, extend, remove then insert.
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_SEMCOM_remove_list performs similar tasks to those performed by 
_SEMCOM_replace_list. However, there is no new list, so the five functions are
called with null values for the newp and nendp pointers. Effectively, remove is the
only function of these five which will do anything when called by
_SEMCOM_remove_list.

The SEMCOM_update function is invoked from the main loop in the outermost
level of PECAN (pascalmain.c), to update the semantics after the execution and 
unexecution is completed. 






Chapter 6 
Modifications to PECAN 

6.1. Aim of the Modifications 

The aim of this thesis project is to find some way of comparing the PECAN
approach to incremental compilation (β-type incremental compilation) with an
α-type incremental compilation method. As mentioned in §2.2.2, balancing the
costs of α-type and β-type incremental compilation is the fundamental design
question in the area of incremental compilation.

To this end, the SEMCOM module has been modified so that PECAN can 
support three different types of incremental compilation: 

• incremental compilation (β-type) as before;

• procedure compilation (α-type incremental compilation with the smallest
enclosing Pascal procedure or function¹ or main program as its
recompilable unit); and

• complete compilation (α-type incremental compilation with the entire
program as its recompilable unit).

Further, the programmer is given the ability to specify that recompilation should
be performed automatically (as before) or manually (i.e. at the programmer's
request).²

Procedure compilation will occur automatically (regardless of whether compilation 
is automatic or manual) if 

• the programmer makes an editing change to a node in the AST which 
is not enclosed by the same procedure as was the last node to be 
changed (i.e. the programmer has moved out of a procedure); and

• the procedure which encloses the last node which was changed has not 
already been recompiled. 
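
The two conditions above amount to a simple predicate, sketched here in Python. The function and parameter names are hypothetical, and procedures are identified by plain strings purely for illustration.

```python
def leaving_triggers_compile(last_proc, edited_proc, recompiled):
    """True when an edit outside last_proc should force that procedure
    to be compiled, whatever the automatic / manual setting."""
    moved_out = last_proc is not None and edited_proc != last_proc
    return moved_out and last_proc not in recompiled


# The programmer edited procedure p, then edits a node inside q.
first_move = leaving_triggers_compile("p", "q", recompiled=set())
# Same move, but p has already been recompiled (e.g. with COMPILE).
after_compile = leaving_triggers_compile("p", "q", recompiled={"p"})
```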



¹ Throughout this chapter the word "procedure" will be used to refer to a procedure or function (except
where the context indicates otherwise).

² For a discussion of the question of when to trigger recompilation, see §2.3.2.2.
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The main interest of this project is with procedure compilation; complete 
compilation was included out of curiosity.

Effectively, these modifications enable a comparison to be made of the relative 
merits of the approach to incremental compilation taken by PECAN and that 
taken by the Magpie system (see §3.3.5). Magpie performs its recompilation on a 
procedure basis. When the programmer has finished making editing changes to a 
procedure, that procedure is recompiled in the background. It is not practicable to 
implement background compilation in PECAN. Nevertheless the two methods can 
be compared within the PECAN system. By setting compilation to manual, and 
allowing PECAN to recompile each procedure after a number of editing changes 
have been carried out within that procedure, PECAN can be made to approximate 
the Magpie approach. 

6.2. Generality of the Modifications 

It will be recalled that, since page 19, PECAN has been considered not as an 
environment generator but as a Pascal environment. However, when modifying the 
SEMCOM module, thought must be given to that module's generality and whether 
any of the modifications are language specific. There is one modification that has 
been made to SEMCOM as part of this thesis project which assumes that the 
supported language is Pascal. One step in procedure compilation involves finding a 
node's enclosing procedure in the AST.³ This is performed by moving up through
the tree until a BLOCK node is found. BLOCK nodes are defined in the
specification for Pascal, but there is no good reason to suppose that the
specification for any other language will define its recompilable unit as a BLOCK.⁴
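
The upward search for the enclosing BLOCK node can be sketched as follows. The Node class is a hypothetical stand-in for an AST node with a parent pointer, and the node kinds other than BLOCK are invented; the real ASPEN data structures are not shown here.

```python
class Node:
    """Minimal stand-in for an AST node with a parent pointer."""

    def __init__(self, kind, parent=None):
        self.kind = kind
        self.parent = parent


def enclosing_block(node):
    """Move up through the tree until a BLOCK node is found."""
    while node is not None and node.kind != "BLOCK":
        node = node.parent
    return node   # None if the node is not inside any BLOCK


program_block = Node("BLOCK")
procedure = Node("PROCEDURE", parent=program_block)
procedure_block = Node("BLOCK", parent=procedure)
statement = Node("ASSIGN", parent=procedure_block)

found = enclosing_block(statement)
```

The language dependence discussed above lives entirely in the string "BLOCK"; a tag read from the language specification could replace it.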

This flouting of generality can be justified for the purposes of this experimental 
comparison of compilation methods. If these modifications to PECAN were to be 
implemented in a more concrete fashion, the language specification could be altered 
to allow an explicit statement that a given construct is a recompilable unit. 
Provision for tagging constructs already exists.⁵ Given the fact that the
modifications made as part of this project were intended only to compare two 
different approaches to incremental compilation, it was deemed unnecessary to alter 
the definition of the specification language. 



³ As described in §6.4.3.

⁴ Indeed, another language may specify more than one recompilable unit. In Pascal, BLOCK is
sufficient as it makes up part of all three recompilable units: procedures, functions and the main
program.

⁵ e.g. the COMMENT label, used to indicate that the construct can be followed by a comment.






6.3. Ideal Modifications 

When an entire procedure is recompiled after a number of modifications have 
been made, the compiler has to replace the flow graph representation of the old 
procedure with a flow graph representation of the new procedure. Parts of a flow 
graph representation are removed when SEMCOM_STMTs are unexecuted. 
However, in the case of procedure compilation, it would be useful if the flow graph 
representation of the procedure could be removed in one step before a new 
representation is constructed by executing SEMCOM_STMTs. Unfortunately, the 
module which maintains the flow graph representation (the FLOW module) does 
not provide a function to remove large sections of the flow graph representation in 
one operation. It was decided to limit the modifications made in this thesis 
project to one module of the PECAN system (the SEMCOM module). 
Accordingly, no change has been made to the FLOW module. Removal of the 
flow graph representation of a procedure is implemented using the 
_SEMCOM_unexecute function.⁶

When comparing the results of a number of tests (see §6.6), the cost of 
unexecuting SEMCOM_STMTs in order to remove the flow graph representation of 
a procedure is ignored on the basis that it would be possible to perform the same 
operation in one step. 



⁶ The same is true when removing the flow graph representation of an entire program during complete
recompilation.
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6.4. Actual Implementation Details 

Details of the SEMCOM module code that has been modified or added in the 
course of this thesis project are given in Appendix B. 

6.4.1. The Compilation Monitor 

The SEMCOM module has been modified so as to provide compilation 
information in a window (the compilation monitor). For incremental compilation 
(as previously implemented) the compilation monitor displays: 

• the number of SEMCOM_STMTs eliminated by head_merge;

• the number of SEMCOM_STMTs eliminated by tail_merge;

• the number of SEMCOM_STMTs by which extend extends the new list;

• the number of SEMCOM_STMTs unexecuted and removed by remove;
and

• the number of SEMCOM_STMTs executed by insert.

This new window allows the programmer to set the type of compilation 
(incremental, procedure or complete) and to toggle the automatic / manual switch. 
There is also a COMPILE button which forces SEMCOM to compile using 
whichever compilation method was last chosen.⁷ Using the COMPILE button has
no effect if the compilation is set to incremental for the very good reason that 
incremental compilation is meaningless unless there is an amended node from which 
to construct a new list. 

The compilation information, together with information about which compilation
method is current, is displayed in the compilation monitor. When this information
can no longer be displayed on the screen, the screen scrolls to keep up with the
latest information. The rest of the new window's commands concern moving
around within the window.

It should be noted that it is possible to do some fairly horrible things to the 
SEMCOM representation of the AST by using the SEMCOM window in a naive 
way. For example, if the user were to set compilation to incremental and manual 
then no change to the AST would result in any compilation being performed. 
Even if compilation were then set to automatic, the effect of the changes made 



⁷ When first invoked, the modified SEMCOM module is ready to perform automatic incremental
compilation, just as it would have done before it was modified.
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while compilation was set to manual would not be reflected in the SEMCOM 
representation of the program's semantics. The modifications to SEMCOM have 
been made for experimental purposes only. Although they provide a fairly robust 
view, that view is not intended to be foolproof. 

6.4.2. Incremental Compilation 

Incremental compilation is performed in precisely the same way as before except 
that calls to the various lower level functions have been moved into different 
functions. 

6.4.3. Procedure Compilation 

In order to perform incremental compilation on a procedural basis, SEMCOM 
makes a copy of the list of SEMCOM_STMTs which are associated with the 
procedure that is being edited before changes are made to that procedure. When a 
change is made to the AST, a list of SEMCOM_STMTs corresponding to the 
changed node is created and merged into the main list at the appropriate place. 
When compilation is triggered⁸ the SEMCOM_STMTs in the list that represents
the old procedure are unexecuted,⁹ then the corresponding SEMCOM_STMTs in the
main list are executed.¹⁰

The effect of this is much the same as if the entire procedure had been modified 
then incrementally compiled in the usual PECAN fashion, except that 

• there is no attempt to find the area of difference (i.e. no use of
head_merge or tail_merge); and

• there is no extension of either list (i.e. no use of extend).

The extend function is not required because procedural compilation is recompiling a 
recompilable unit. A recompilable unit has been defined¹¹ as a construct of the
language such that no change to that construct can affect the meaning of any part 



⁸ Procedure compilation can be triggered in one of three ways: manually (by use of the COMPILE
button), automatically (every time a change is made), or because the programmer has edited a node of
the AST that is outside the procedure.

⁹ For a discussion of the reasons why these SEMCOM_STMTs are unexecuted, see §6.3.

¹⁰ Execution commences after the current items have been restored to their appropriate values in the
manner described in §5.5.2.5.

¹¹ See page 4.
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of the program outside that construct. In other words, no change within a 
procedure can cause any local effects in the semantic specification statements 
beyond the end of that procedure; the use of the extend function would not result 
in any extension of the new list. 

6.4.4. Complete Compilation 

The modified SEMCOM module performs complete compilation in the following 
manner. When a change is made to the AST, a list of SEMCOM_STMTs 
corresponding to the changed node is created and merged into the main list, then 
the remove function is applied to the old list in order to remove the corresponding 
nodes from the flow graph representation of the program. When compilation is 
triggered¹² the SEMCOM_STMTs in the main list are unexecuted, then executed.

This approach could be (uncharitably) described as being a bit "quick and dirty". 
After all, unexecuting the main list involves unexecuting SEMCOM_STMTs which 
have not yet been executed (specifically, all of the SEMCOM_STMTs that have 
been merged into the main list after changes to the AST). The
_SEMCOM_unexecute function is sufficiently robust to handle this without
incident, because it does not attempt to remove any non-existent nodes from the 
flow graph representation. 

6.5. Drawing Comparisons 

6.5.1. Choosing an Appropriate Benchmark 

Three possible benchmarks were considered for comparing the efficiency of the 
different methods of incremental compilation implemented by the modified 
SEMCOM module: elapsed time, code complexity and counting SEMCOM_STMTs. 

6.5.1.1. Elapsed Time 

The main problem with measuring elapsed time is that it is affected in 
unpredictable ways by such diverse and uncontrollable factors as the number of 
users on the machine, the amount of free memory available, etc. There is no way
to predict whether a particular method will benefit from the idiosyncrasies of
the system on which the tests are carried out (or the state of the machine at the
time at which the tests are carried out). This method is plainly unacceptable.



¹² Complete compilation can be triggered in either of two ways: manually (by use of the COMPILE
button), or automatically (every time a change is made).
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6.5.1.2. Code Complexity 

Profiling the C code that is actually executed by PECAN (i.e. counting C 
statements) would provide the most detailed possible comparison of compilation 
methods. This approach assumes that all of the functions which are invoked by 
the various compilation methods are provided by code which is roughly equivalent 
in its efficiency. Otherwise, one compilation method could compare unfavourably 
with another for no other reason than that it made frequent use of a function 
which was inefficiently written. This approach was deemed too dependent upon 
the implementation of PECAN to be a good benchmark. 

6.5.1.3. Counting SEMCOM_STMTs

Another approach is to count the SEMCOM_STMTs that are processed. Rather 
than comparing the PECAN code executed by each method (as done when 
comparing code complexity) this approach examines the amount of the program 
under development that each method recompiles. No assumptions need be made 
about the relative efficiency of PECAN functions. 

For each compilation method, the compilation monitor provides information on
the number of SEMCOM_STMTs that have been executed and unexecuted and (in
the case of incremental compilation) the number of SEMCOM_STMTs that have
been eliminated by head_merge and tail_merge, and the extent to which the new
list has been extended. From this information it is possible to derive a single
number of SEMCOM_STMTs for comparison purposes. This number will be
referred to as A. For incremental compilation, the number (A_i) is

• the number of SEMCOM_STMTs unexecuted by extend; plus

• the number of SEMCOM_STMTs unexecuted by remove; plus

• the number of SEMCOM_STMTs executed by insert.

For both procedure and complete compilation, the number (A_p or A_c) is

• the number of SEMCOM_STMTs executed.

Note that, for procedure compilation and complete compilation, the number of
SEMCOM_STMTs unexecuted is ignored (see §6.3). Counting SEMCOM_STMTs is
the preferred method of comparison.
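
The derivation of the A-values from the compilation monitor's counts can be written out as a short Python sketch. The function names are hypothetical and the counts in the example are invented for illustration; they are not taken from any of the tests.

```python
def a_incremental(unexecuted_by_extend, unexecuted_by_remove,
                  executed_by_insert):
    """A-value for incremental compilation: every statement that is
    unexecuted or executed counts once."""
    return unexecuted_by_extend + unexecuted_by_remove + executed_by_insert


def a_whole_unit(executed, unexecuted):
    """A-value for procedure or complete compilation. The unexecuted
    count is accepted but deliberately ignored (see §6.3)."""
    return executed


# Invented monitor readings, for illustration only.
example_incremental = a_incremental(10, 25, 30)
example_procedure = a_whole_unit(executed=120, unexecuted=115)
```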
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6.5.2. A Cautionary Note 

Before comparing A-values for the test cases, it is important to consider some 
inadequacies in the chosen approach to comparing compilation methods. The 
approach is deficient in three ways: 

• Procedure compilation and complete compilation have been built upon a 
system which was designed specifically for incremental compilation. 
PECAN's method of incremental compilation is being compared with 
that of Magpie (and traditional complete recompilation) within a 
framework which was constructed specifically for PECAN's method. 
Therefore, it must be expected (in a general sense) that the 
implementations of procedure and complete compilation will not be the 
most efficient. 

• Counting SEMCOM_STMTs makes no allowance for the considerable 
computation required to perform semantic updating after execution and 
unexecution (as described in §5.5.2.6). Comparing A-values in the 
suggested manner assumes that the amount of computation required by 
the updating process is proportional to the number of SEMCOM_STMTs 
executed and unexecuted. This assumption would appear to be 
reasonable; no one compilation method could be expected to require 
more updating per SEMCOM_STMT than any other. However, this
assumption has not been properly validated. 

• Counting SEMCOM_STMTs takes no account of the computation 
performed by the extend function in determining how far to extend the 
new list. This difficulty can be obviated by assuming that the 
computational cost of extending the new list by one SEMCOM_STMT is 
negligible when compared with the cost of executing or unexecuting one 
SEMCOM_STMT. This assumption is not necessarily invalid, but is by 
no means safe. 

A further extension of this thesis project would have been 

o to prove this assumption; or 

o to develop a method of incorporating the cost of extending the 
new list into the comparison method. 

These drawbacks must be considered when evaluating the results of tests described 
in §6.6. 

6.6. Testing 

To compare the different methods of compilation, a suite of Pascal programs was 
prepared. These programs were modified in various ways and A-values were 
calculated for each of the compilation methods. 



6.6.1. Choosing Test Programs 

When preparing the suite of test programs, a major factor constraining the choice
of program was PECAN itself. PECAN will only support the development of
small programs.¹³ Given that the test programs were restricted in size it was
decided to use examples which were typical of the programs written by
programmers when learning to code in Pascal. Four programs were used: two
from [Findlay 81], one from [Jensen 78], and one from the author's salad days.
These programs are listed in Appendix C.

6.6.2. Modifications 

It is important that the modifications made to the test programs reflect the sorts 
of changes that programmers are likely to make to Pascal code during program 
development. Unfortunately, literature on this topic proved undiscoverable.¹⁴

Any consideration of the manner in which programmers modify programs is 
complicated by the fact that the environment in which the program is being 
developed may affect the way in which programs are debugged. For example, if
the environment recompiles small changes immediately and quickly then the 
programmer may be encouraged to move freely around the source code when 
debugging. However, if the environment pauses to recompile each procedure after 
editing changes have been made within that procedure then the programmer may 
be tempted to stay within that procedure until all of the intended changes have 
been made. 

The sorts of editing changes made during program development are strongly 
linked to the errors that programmers tend to generate. After all, a major part of 
the debugging process is the removal of syntactic and semantic errors from the 
source code. The authors of [Garlan 84] claim that four errors account for 90% of 
all compiler error messages for Pascal programs developed by novice programmers. 



¹³ One Pascal program of a mere 150 lines proved too large.

¹⁴ Methods of measuring a programmer's aptitude for debugging are discussed in [Weinberg 71] (see
pages 174-175). Unfortunately, no mention is made of the sorts of editing changes that apt, or inapt,
programmers make when debugging.
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In order of frequency these are: 

1. variable not declared; 

2. variable declared, but not used; 

3. variable declared and used, but not initialized; and 

4. type mismatch 

(of these four, only the first and the fourth are recognised as errors by PECAN). 

Armed with this information (and the author's well-developed intuitions regarding 
the sorts of editing changes made during the development of a Pascal program) a 
series of tests was designed. These tests are intended to be indicative of the
kinds of changes which programmers make. 

Where a test required an initially incorrect program, the correct program was 
modified so that it was incorrect before modifications were performed in order to 
return the program to its original state. Eight tests were carried out. 

1. test1.p (§C.1)

4 occurrences of the same (undefined) variable were changed to a
defined variable (scalarproduct). All of the occurrences were in the
same procedure (multiplymatrices).

A_i = 436    A_p = 640    A_c = 2018

2. test1.p (§C.1)

All 10 occurrences of the constant n were replaced with the integer
constant 10. The constant n occurred in all 3 procedures. The changes
were made in the order in which the instances of n occurred.

A_i = 1640    A_p = 2811    A_c = 2128

3. test2.p (§C.2)

A single change was made to the definition of the constant pi.

A_i = 60    A_p = 904    A_c = 925

4. test2.p (§C.2)

4 occurrences of the same (undefined) variable were changed to a
defined variable (degrees). All 4 occurrences were in the main program.

A_i = 428    A_p = 904    A_c = 925

5. test2.p (§C.2)

The invocation of the tan function was replaced by an expression which
produced the same result,¹⁵ then the tan function was removed from the
program.

A_i = 472    A_p = 1652    A_c = 808

6. test3.p (§C.3)

A single corrective change was made to a misspelt function call in the
main program.

A_i = 100    A_p = 147    A_c = 447

7. test3.p (§C.3)

All 3 occurrences of an undefined identifier within the factorial function
were changed to references to that function.

A_i = 436    A_p = 147    A_c = 447

8. test4.p (§C.4)

5 more calls to the try function were added to the main program.

A_i = 285    A_p = 649    A_c = 670



Full details of all of the compilation information extracted for each of these tests 
are given in Figure 6-1. In the case of incremental compilation, the column 
headings are 

H - SEMCOM_STMTs disposed of by head_merge; 

T - SEMCOM_STMTs disposed of by tail_merge; 

E - SEMCOM_STMTs by which extend extends the new list; 

R - SEMCOM_STMTs removed and unexecuted by remove; and 

I - SEMCOM_STMTs inserted and executed by insert. 

In the case of procedure compilation and complete compilation the column headings 
are 

UN - SEMCOM_STMTs unexecuted; and 

EX - SEMCOM_STMTs executed. 

¹⁵ i.e. tan(degrees*pi/180) was replaced by sin(degrees*pi/180) / cos(degrees*pi/180). 

6.6.3. Comparison of Results 

In this section, the results are interpreted by simple comparison of Δ-values. 
The questions raised (in §6.5.2) about the efficacy of comparing Δ-values are 
ignored for the moment. 

In 5 out of the 8 tests¹⁶ incremental compilation performed better than procedure 
compilation, which in turn performed better than complete compilation.¹⁷ In 2 of 
the tests¹⁸ procedure compilation did not perform as well as complete compilation, 
due to the large number of procedures which were edited. 

Only in test 7 was incremental compilation not the most efficient of the 
compilation methods (although it still performed better than complete compilation). 
In that test, 3 changes were made within a function. That function is so short 
that compiling the 3 changes separately required more work than compiling the 
whole function once. 
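
The counts quoted above can be checked mechanically against the Δ-values listed 
for the eight tests. The following is a minimal sketch in C; the struct and 
function names are illustrative only and are not part of PECAN. 

```c
#include <stddef.h>

/* Delta-values reported for tests 1 to 8:
 * i = incremental, p = procedure, c = complete compilation. */
struct delta { int i, p, c; };

static const struct delta results[8] = {
    {  436,  640, 2018 },
    { 1640, 2811, 2128 },
    {   60,  904,  925 },
    {  428,  904,  925 },
    {  472, 1652,  808 },
    {  100,  147,  447 },
    {  436,  147,  447 },
    {  285,  649,  670 }
};

/* Tests in which incremental < procedure < complete. */
int count_strict_order(const struct delta *r, size_t n)
{
    size_t k;
    int count = 0;
    for (k = 0; k < n; k++)
        if (r[k].i < r[k].p && r[k].p < r[k].c)
            count++;
    return count;
}

/* Tests in which procedure compilation cost more than complete compilation. */
int count_procedure_worse(const struct delta *r, size_t n)
{
    size_t k;
    int count = 0;
    for (k = 0; k < n; k++)
        if (r[k].p > r[k].c)
            count++;
    return count;
}

/* Tests in which incremental compilation was not the cheapest method. */
int count_incremental_not_best(const struct delta *r, size_t n)
{
    size_t k;
    int count = 0;
    for (k = 0; k < n; k++)
        if (r[k].i > r[k].p || r[k].i > r[k].c)
            count++;
    return count;
}
```

Applied to the results table, the three functions give the 5, 2 and 1 tests 
referred to in the text. 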

On the basis of these results, it would seem that unless the changes made within 
a recompilable unit affect a substantial amount of that recompilable unit (i.e. 
either the unit is very small, or the number of changes is large), incremental 
compilation is more efficient than procedure compilation. 

In other words (and making no allowance for the computational cost of extending 
the new list), β-type incremental compilation is more efficient than α-type 
incremental compilation. 

It is also interesting to note that the head_merge and tail_merge functions 
discard very few SEMCOM_STMTs. This raises doubts as to the need to reduce 
the old list and the new list to the area of difference when incrementally 
compiling Pascal structures. 



¹⁶ Tests 1, 3, 4, 6 and 8. 

¹⁷ i.e. Δ_i < Δ_p < Δ_c. 

¹⁸ Tests 2 and 5. 
test1.p    2018 SEMCOM_STMTs 

(1) 4 undefined variables changed 

    Incremental (column headings H, T, E, R, I) 

                  1     27      1     28 
                  1     88      1     89 
                  1     80      1     81 
                  1     19      1     20 
        total     4    214      4    218 

        Δ_i = 436 

    Procedure (column headings UN, EX) 

                640    640 

        Δ_p = 640 

    Complete (column headings UN, EX) 

               2018   2018 

        Δ_c = 2018 

Figure 6-1: Results of Modifying Test Programs 
test2.p    925 SEMCOM_STMTs 

(3) Single change to value of constant at outer level 

(4) 

    Incremental (column headings H, T, E, R, I) 

                  1     38      1     39 
                  1     34      1     35 
                  1     52      1     53 
                  1     86      1     87 
        total     4    210      4    214 

        Δ_i = 428 

    Procedure (column headings UN, EX) 

                904    904 

        Δ_p = 904 

    Complete (column headings UN, EX) 

                925    925 

        Δ_c = 925 

(5) Replace call to tan(X) with sin(X)/cos(X), then delete tan function 

test3.p    447 SEMCOM_STMTs 

(6) Single undefined function call changed 

(7) 3 occurrences of undefined function identifier changed 

test4.p    670 SEMCOM_STMTs 

(8) 5 more calls to try added to main body 
Chapter 7 
Conclusions 

As stated in §6.6.3, the test results suggest that β-type incremental compilation 
(where the smallest amount of recompilation is performed after each editing 
change) is more efficient than α-type incremental compilation (where a structure of 
the programming language is chosen as the recompilable unit). However, there are 
a number of deficiencies in the comparison method chosen (as explained in §6.5.2). 
Of these deficiencies, the one which favours incremental compilation the most is the 
third: the fact that no account was taken of the computation performed by the 
extend function in order to determine how far to extend the new list. In the tests 
described in §6.6, some 35% of all of the SEMCOM_STMTs executed and 
unexecuted during incremental compilation were unexecuted by the extend function 
(i.e. the new list was extended to include those SEMCOM_STMTs). This indicates 
that the cost of extending the new list significantly affects the total cost of 
incremental compilation in PECAN. 

A more comprehensive comparison of the compilation methods would have taken 
account of the cost of the extend function. Profiling the C code which is actually 
executed by PECAN in each case would provide such a comparison. That method 
was not adopted for this project because it is too dependent upon the 
implementation of PECAN (see §6.5.1.2). However, profiling the code would be an 
appropriate benchmark if the environment in which the comparisons were made were 
not biased towards one method of compilation, as PECAN is towards incremental 
compilation. 

The comparisons that have been made between α-type and β-type incremental 
compilation do not allow any plenary statements to be made about the relative 
efficiency of the two methods. However, the performance of incremental 
compilation is not spectacularly better than that of procedure compilation, 
especially when the bias of the comparison method towards incremental compilation 
is taken into account. The results suggest that the gains in efficiency associated 
with β-type incremental compilation are so small that they do not justify the large 
amount of programming work, and structural overheads, required to implement 
such a compilation mechanism. β-type incremental compilation is faster, but not 
significantly faster, than α-type incremental compilation. 

PECAN is a useful tool with which to test and demonstrate various aspects of 
programming environment design. However, it is only of limited use for examining 
general aspects of incremental compiler design. The entire structure of PECAN, 
from its language specification to the internal representation of its compilation 
module, is oriented towards β-type incremental compilation. The experiments 
carried out as part of this thesis project demonstrate that PECAN is not the ideal 
environment in which to compare various methods of incremental compilation. 

The other achievements of this project are the thorough description of PECAN's 
compilation mechanism, and the implementation of the semantic actions view (a 
robust and useful view into the PECAN system). 
Appendix A 
The Semantic Actions View 

A.1. The View and its Functions 

A new view has been developed for the PECAN system. This view provides a 
list of the SEMCOM_STMTs associated with the current node, as highlighted in 
the SDE and other program views. Information about the type of the current 
node and its position in the tree is also provided. Buttons are provided which 
allow the window to be scrolled so that all of the list may be examined. Other 
buttons provide tree traversal commands. The view will respond to changes of the 
current node in other views, and will cause changes to be reflected in other views 
when the tree traversal commands are used. 

An example PECAN screen, showing the semantic actions view, is reproduced in 
Figure A-1. The SDE and the flow view are on the left side of the screen; the 
semantic actions view is on the right. The SDE's cursor indicates the factorial 
identifier, and the flow view's cursor indicates the statement which assigns a value 
to that identifier. 

The semantic actions view indicates that the current AST node is an 
IDENTIFIER node. It is the first of two children, and has one child of its own. 
The list of SEMCOM_STMTs that follows is that list associated with the parent of 
the current node. Those SEMCOM_STMTs associated with the current node are 
indicated by arrows ("->") and are separated from the surrounding 
SEMCOM_STMTs by two horizontal lines. The SEMCOM_STMTs associated with 
the parent of the current node are displayed in order to put the current node's 
SEMCOM_STMTs into context. 
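
The layout described above (the parent's list, with the current node's 
statements marked by arrows and bracketed by rule lines) can be sketched as a 
walk over a doubly linked list. This is an illustrative sketch only: struct stmt 
is a hypothetical stand-in for a SEMCOM_STMT, and the function writes into a 
string buffer rather than a PECAN window. 

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a SEMCOM_STMT: a node in a doubly linked list. */
struct stmt {
    struct stmt *next;
    struct stmt *prev;
    const char  *name;
};

/* Append prefix + text + newline to out, truncating if the buffer is full. */
static void append_line(char *out, size_t outsz,
                        const char *prefix, const char *text)
{
    size_t used = strlen(out);
    if (used + 1 < outsz)
        snprintf(out + used, outsz - used, "%s%s\n", prefix, text);
}

/* List the statements from parent_head to parent_tail, one per line.
 * Statements from cur_head to cur_tail are prefixed with "-> " and are
 * bracketed by rule lines. Returns the number of statements listed. */
int list_with_context(const struct stmt *parent_head,
                      const struct stmt *parent_tail,
                      const struct stmt *cur_head,
                      const struct stmt *cur_tail,
                      char *out, size_t outsz)
{
    const char *indent = "";
    const struct stmt *s;
    int count = 0;

    if (outsz == 0)
        return 0;
    out[0] = '\0';

    for (s = parent_head; s != NULL;
         s = (s == parent_tail) ? NULL : s->next) {
        if (s == cur_head) {
            append_line(out, outsz, "", "--------");
            indent = "-> ";
        }
        append_line(out, outsz, indent, s->name);
        count++;
        if (s == cur_tail) {
            append_line(out, outsz, "", "--------");
            indent = "";
        }
    }
    return count;
}
```

The loop stops after parent_tail rather than at the end of the underlying list, 
since the parent's statements are themselves a sub-range of a longer list. 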

SEMCOM_STMTs are displayed in the following form:¹ 

(location) : name index [value] @ pointer into AST 

The index is displayed as a decimal number. All other numbers are hexadecimal. 
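
As an illustration of this display form, a single line could be produced as 
follows. This is a sketch under assumptions: struct stmt_view and its field 
names are hypothetical, and addresses are held as unsigned long values purely 
for printing. 

```c
#include <stdio.h>

/* Hypothetical printable view of one statement. */
struct stmt_view {
    unsigned long location;  /* address of the statement itself */
    const char   *name;      /* statement name                  */
    int           index;     /* shown in decimal                */
    unsigned long value;     /* shown in hexadecimal            */
    unsigned long node;      /* pointer into the AST            */
};

/* Format one statement as "(location) : name index [value] @ pointer".
 * Returns the value of snprintf (the length of the full line). */
int format_stmt(const struct stmt_view *s, char *buf, size_t bufsz)
{
    return snprintf(buf, bufsz, "(0x%lx) : %s %d [0x%lx] @ 0x%lx",
                    s->location, s->name, s->index, s->value, s->node);
}
```

Only the index uses a decimal conversion; every other number is printed with a 
hexadecimal conversion, as stated above. 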



¹ In the same form as they are displayed by the _SEMCOM_dump function in semcommain.c. 

Figure A-1: The Semantic Actions View 

The scroll bar on the right side of the view indicates that approximately two 
thirds of the whole list² is currently displayed. The window onto the list can be 
scrolled to a desired point in the list by using the mouse to click on the 
corresponding point on the scroll bar, or by using the scroll buttons (TOP, 
BOTTOM, SCROLL UP, SCROLL DOWN, UP and DOWN).³ 
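
The scroll bar arithmetic can be sketched as follows. The calculation mirrors 
the percentages computed in the sawdustmain.c listing (§A.5), but the function 
and parameter names here are illustrative and the sketch is independent of the 
VT and WILLOW modules. 

```c
/* Percentages of the list at which the visible window starts and ends. */
struct extent { int top; int bottom; };

static int max_int(int a, int b) { return a > b ? a : b; }

/* first_visible: first line shown; visible_lines: height of the window;
 * last_line: last line of the list that has been written. */
struct extent scroll_extent(int first_visible, int visible_lines,
                            int last_line)
{
    struct extent e = { 0, 100 };
    int total = max_int(last_line, first_visible + visible_lines);

    if (total > 0) {
        e.top = first_visible * 100 / total;
        e.bottom = (first_visible + visible_lines) * 100 / total;
    }
    return e;
}

/* First line to show when the user clicks at percent (0..100) on the
 * scroll bar: the chosen point is centred, then clamped to the list. */
int scroll_to_percent(int percent, int first_visible, int visible_lines,
                      int last_line)
{
    int total = max_int(last_line, first_visible + visible_lines);
    int top = percent * total / 100 - visible_lines / 2;

    if (top < 0)
        top = 0;
    else if (top > last_line)
        top = last_line;
    return top;
}
```

With a 40-line window at the top of a 60-line list, scroll_extent reports that 
the window covers 0% to 66% of the list, i.e. approximately two thirds. 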

The tree traversal buttons move the current node around the AST.⁴ IN moves 
to the first (leftmost) child of the current node, and OUT moves to its parent. 
NEXT moves to the current node's next sibling, and BACK moves to its previous 
sibling. The view is updated after each tree traversal command, and an event is 
triggered so that other views will also reflect the change. 
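
The four traversal commands can be sketched over a simple tree structure. 
struct node below is a hypothetical stand-in for an AST node (it is not the 
ASPEN_NODE type), and each function returns the unchanged current node when the 
requested move is not possible. 

```c
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical tree node with a parent link and an ordered child array. */
struct node {
    struct node *parent;
    struct node *child[MAX_CHILDREN];
    int          nchildren;
};

/* Attach child as the last child of parent; returns -1 if parent is full. */
int add_child(struct node *parent, struct node *child)
{
    if (parent->nchildren >= MAX_CHILDREN)
        return -1;
    child->parent = parent;
    parent->child[parent->nchildren++] = child;
    return 0;
}

/* Position of n among its parent's children (n must have a parent). */
static int child_index(const struct node *n)
{
    int i;
    for (i = 0; i < n->parent->nchildren; i++)
        if (n->parent->child[i] == n)
            return i;
    return -1;
}

/* IN: first (leftmost) child. */
struct node *move_in(struct node *cur)
{
    return (cur != NULL && cur->nchildren > 0) ? cur->child[0] : cur;
}

/* OUT: parent. */
struct node *move_out(struct node *cur)
{
    return (cur != NULL && cur->parent != NULL) ? cur->parent : cur;
}

/* NEXT: next sibling. */
struct node *move_next(struct node *cur)
{
    int i;
    if (cur == NULL || cur->parent == NULL)
        return cur;
    i = child_index(cur);
    if (i >= 0 && i < cur->parent->nchildren - 1)
        return cur->parent->child[i + 1];
    return cur;
}

/* BACK: previous sibling. */
struct node *move_back(struct node *cur)
{
    int i;
    if (cur == NULL || cur->parent == NULL)
        return cur;
    i = child_index(cur);
    return (i > 0) ? cur->parent->child[i - 1] : cur;
}
```

In the view itself each successful move is followed by a redisplay and by an 
event that informs the other views; that step is omitted from this sketch. 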

A.2. Implementation Details 

The semantic actions view is implemented by a new module called SAWDUST.⁵ 
The view is designed to be completely compatible with existing views. The event 
passing system (provided by the PLUM module, and described in §4.2.3) is used to 
provide a clean interface between SAWDUST and existing modules. The 
formatting, tracing and function-naming conventions adopted in other PECAN 
modules have been followed in SAWDUST. 
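
The idea of event passing between views can be sketched as a small registry of 
handlers. This is not the PLUM interface: accept_event, send_event and the 
example handler are hypothetical names chosen for illustration. The handler 
ignores events that its own view sent, and events that do not change the current 
node, so that a view does not redraw in response to its own traversal commands. 

```c
#include <string.h>

#define MAX_HANDLERS 8

/* A handler receives the node concerned and the name of the sender. */
typedef void (*handler_fn)(const void *node, const char *source);

struct registration { const char *event; handler_fn fn; };

static struct registration handlers[MAX_HANDLERS];
static int nhandlers = 0;

/* Register fn for the named event; returns -1 if the table is full. */
int accept_event(const char *event, handler_fn fn)
{
    if (nhandlers >= MAX_HANDLERS)
        return -1;
    handlers[nhandlers].event = event;
    handlers[nhandlers].fn = fn;
    nhandlers++;
    return 0;
}

/* Deliver the event to every registered handler; returns the number called. */
int send_event(const char *event, const void *node, const char *source)
{
    int i, delivered = 0;
    for (i = 0; i < nhandlers; i++) {
        if (strcmp(handlers[i].event, event) == 0) {
            handlers[i].fn(node, source);
            delivered++;
        }
    }
    return delivered;
}

/* Example view state and handler (hypothetical). */
static const void *view_current = NULL;
static int view_redraws = 0;

void view_on_current(const void *node, const char *source)
{
    if (strcmp(source, "VIEW") == 0)
        return;                 /* the view's own event */
    if (node == view_current)
        return;                 /* nothing has changed */
    view_current = node;
    view_redraws++;
}
```

A view built this way needs no knowledge of the modules that send the events, 
which is the sense in which the interface is clean. 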



² i.e. the list of SEMCOM_STMTs associated with the parent of the current node. 

³ UP moves the window up by one quarter of a screen; SCROLL UP moves the window up by a whole 
screen. 

⁴ More correctly, the tree traversal buttons affect which node of the AST is considered the current node. 

⁵ SAWDUST stands for Semantic Action Window Display Using Several Tiles. This is a somewhat 
contrived acronym, but it pales into insignificance when compared with some of the acronyms which are 
used to name PECAN modules. Examples range from the utilitarian 

    ASH - A Screen Handler, 
    APIO - Apollo Input Only Package (an anagrammatical acronym), 
    MFE - MAPLE Front End, and 
    VDI - Virtual Device Interface 

through the fairly plausible 

    SGP - Simple Graphics Package, 
    BRIM - Brown Image Format, and 
    PLUM - Programming Language Utility Module 

rising to the giddy heights of 

    BALSA - Brown University Algorithm Simulator and Animator, and 
    WILLOW - Wonderful Integrated Language for Laying-Out Windows. 

Regrettably, the meanings of MAPLE and TULIP are unknown. 
In this context, SAWDUST seems almost credible as an acronym. 

The SAWDUST module consists of four files: 

sawdust.h (§A.3) 
    The external header file. Lists the externally accessible SAWDUST 
    functions and gives details of the module's trace facilities.⁶ 

sawdust_local.hi (§A.4) 
    The local header file. Includes a definition of the 
    SAWDUST_SEMCOM_STMT type, which is identical in structure to the 
    SEMCOM_STMT type but is defined in this way because the 
    SEMCOM_STMT type is not externally accessible. 

sawdustmain.c (§A.5) 
    Defines the SAWDUST window (using the WILLOW module from the Brown 
    Workstation Environment) and displays SEMCOM_STMTs (using the VT 
    module which provides a virtual terminal). Window movement and 
    re-sizing is handled by WILLOW. 

sawdustbutton.c (§A.6) 
    Button handling routines. 

⁶ Note that PECAN's main function (contained in pascalmain.c) is modified so as to invoke the 
SAWDUSTinit function and to allow trace information to be passed to the SAWDUSTtrace function. The 
previously unused Z debug switch was utilized: invoking PECAN with the -DZn option will cause the 
number n to be passed to SAWDUSTtrace. 
A.3. Program Listing: sawdust.h 

/************************************************************************/
/*                                                                      */
/*      sawdust.h                                                       */
/*                                                                      */
/*      External definitions for the Semantic Actions Window            */
/*                                                                      */
/************************************************************************/
/*      James Popple                            August 1987             */

/************************************************************************/
/*                                                                      */
/*      Tracing definitions -- use "-DZn" switch, where n gives         */
/*                             the type(s) of trace                     */
/*                                                                      */
/************************************************************************/

#define SAWDUST_TRACE_OFF       0
#define SAWDUST_TRACE_ON        1       /* external calls       */
#define SAWDUST_TRACE_INT       2       /* internal calls       */
#define SAWDUST_TRACE_DEBUG     4       /* debug                */

/************************************************************************/
/*                                                                      */
/*      Routine definitions                                             */
/*                                                                      */
/************************************************************************/

extern          SAWDUSTinit();          /* sawdustmain.c        */
extern          SAWDUSTtrace();

/* end of sawdust.h */
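
The trace values 1, 2 and 4 defined in this header are distinct bits, so the 
number n given with the -DZn switch can select any combination of trace types. 
The following is a minimal sketch of that bitmask test in standard C. It is not 
the SAWDUST code itself: the names are illustrative, and a variadic function is 
used in place of the fixed argument list of the listings. 

```c
#include <stdarg.h>
#include <stdio.h>

/* Trace kinds, one bit each, mirroring the values 0, 1, 2 and 4. */
enum {
    TRACE_OFF      = 0,
    TRACE_EXTERNAL = 1,
    TRACE_INTERNAL = 2,
    TRACE_DEBUG    = 4
};

static int trace_level = TRACE_OFF;

/* Select which kinds of trace are printed, e.g. 5 = external + debug. */
void set_trace(int level)
{
    trace_level = level;
}

/* Print the message if its kind is enabled; returns 1 if it was printed. */
int trace(int kind, const char *fmt, ...)
{
    va_list ap;

    if ((trace_level & kind) == 0)
        return 0;
    fputs("SAWDUST: ", stdout);
    va_start(ap, fmt);
    vprintf(fmt, ap);
    va_end(ap);
    putchar('\n');
    return 1;
}
```

For example, after set_trace(5) a call with TRACE_EXTERNAL or TRACE_DEBUG 
prints its message, while a call with TRACE_INTERNAL prints nothing. 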
A.4. Program Listing: sawdust_local.hi 

/************************************************************************/
/*                                                                      */
/*      sawdust_local.h (from sawdust_local.hi)                         */
/*                                                                      */
/*      Local definitions for Semantic Actions Window                   */
/*                                                                      */
/************************************************************************/

#include <aspen.h>
#include <flow.h>
#include <ash.h>
#include <maple.h>
#include <vt.h>
#include <willow.h>
#include <symbols.h>
#include <type.h>
#include <expr.h>
#include <semcom.h>
#include <acer.h>
#include <sawdust.h>

/************************************************************************/
/*                                                                      */
/*      Data structures                                                 */
/*                                                                      */
/************************************************************************/

#ifndef SAWDUST_MAIN
#define PLUM_INCLUDE_ONLY
#endif

Mode SawdustDefs Is

Type ASPEN_NODE From Mode ASPEN External;

SAWDUST_SEMCOM_STMT =>                          statement descriptor
   SEMCOM_next          SAWDUST_SEMCOM_STMT,    next statement
   SEMCOM_last          SAWDUST_SEMCOM_STMT,    previous statement
   SEMCOM_type          Short,                  statement type
   SEMCOM_index         Short,                  index for orgs
   SEMCOM_node          ASPEN_NODE,             tree node for orgs
   SEMCOM_value         Univ Ptr;               value

End

#ifndef SAWDUST_MAIN
#undef PLUM_INCLUDE_ONLY
#endif

/************************************************************************/
/*                                                                      */
/*      Constants                                                       */
/*                                                                      */
/************************************************************************/

#define SAWDUST_FONT            WILLOWfontname("PALM_FONT")

/************************************************************************/
/*                                                                      */
/*      Tracing definitions                                             */
/*                                                                      */
/************************************************************************/

extern  Integer         SAWDUST_tracelvl;

#define TRACE   if (SAWDUST_tracelvl & SAWDUST_TRACE_ON) SAWDUST_trace
#define ITRACE  if (SAWDUST_tracelvl & SAWDUST_TRACE_INT) SAWDUST_trace
#define DTRACE  if (SAWDUST_tracelvl & SAWDUST_TRACE_DEBUG) SAWDUST_trace

/************************************************************************/
/*                                                                      */
/*      Miscellaneous definitions                                       */
/*                                                                      */
/************************************************************************/

#define SDE_LOCATE(node,syname,syid) PLUMevent_by_id(SDE_event_current,\
                                        node,syname,syid)

/************************************************************************/
/*                                                                      */
/*      Variable definitions                                            */
/*                                                                      */
/************************************************************************/

extern  ASPEN_NODE      SAWDUST_current_node;
extern  ASPEN_NODE      SAWDUST_parent_node;
extern  Universal       SDE_event_current;

/************************************************************************/
/*                                                                      */
/*      Routine definitions                                             */
/*                                                                      */
/************************************************************************/

                                /* sawdustmain.c */
extern          SAWDUST_display_node();
extern          SAWDUST_scroll();
extern          SAWDUST_trace();

                                /* sawdustbutton.c */
extern          SAWDUST_button_top();
extern          SAWDUST_button_bottom();
extern          SAWDUST_button_in();
extern          SAWDUST_button_out();
extern          SAWDUST_button_next();
extern          SAWDUST_button_back();
extern          SAWDUST_button_scroll_up();
extern          SAWDUST_button_scroll_down();
extern          SAWDUST_button_up();
extern          SAWDUST_button_down();
extern          SAWDUST_button_scroll();

/* end of sawdust_local.h */
A.5. Program Listing: sawdustmain.c 

/************************************************************************/
/*                                                                      */
/*      sawdustmain.c                                                   */
/*                                                                      */
/*      Main routines for the Semantic Actions Window                   */
/*                                                                      */
/************************************************************************/

#define SAWDUST_MAIN

#include "sawdust_local.h"
#include <sem_reader.h>

/************************************************************************/
/*                                                                      */
/*      Local storage definitions                                       */
/*                                                                      */
/************************************************************************/

        ASPEN_NODE      SAWDUST_current_node = NULL;
        ASPEN_NODE      SAWDUST_parent_node = NULL;
        Universal       SDE_event_current;
        Integer         SAWDUST_tracelvl = 0;

static  ASH_WINDOW      SAWDUST_window = NULL;
static  Integer         SAWDUST_vtid = -1;

static  Boolean         eolfg = TRUE;
static  Integer         sawdust_font = 0;

static  Integer         num_lines = 0;
static  Integer         num_cols = 0;

/************************************************************************/
/*                                                                      */
/*      Forward Definitions                                             */
/*                                                                      */
/************************************************************************/

static          sawdust_record();
static          new_sawdust_window();
static          sawdust_control();
static          sawdust_sde_current();
static          setup_sawdust_window();
static          remove_sawdust_window();
static          set_window_name();
static          sawdust_define_scroll();
static          sawdust_dump();

/************************************************************************/
/*                                                                      */
/*      Window Definitions                                              */
/*                                                                      */
/************************************************************************/

static WILLOW_DEFN sawdust_window = {
   WILLOW_CLASS_USER,
   { "SAWDUST", "pecan.icons", 'C' },
   { 300, 100, 800, 1024, 400, 300, 1, 1 },
   { ASH_WINDOW_HIT_PARENT,
     WILLOW_TITLE_TAB_SENSE,
     WILLOW_INSTANCE_SAVED_1 },
   new_sawdust_window,
   NULL,
   { { { "IN" },
       WILLOW_LOCATION_TAIL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SAWDUST_button_in) },
     { { "OUT" },
       WILLOW_LOCATION_TAIL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SAWDUST_button_out) },
     { { "NEXT" },
       WILLOW_LOCATION_TAIL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SAWDUST_button_next) },
     { { "BACK" },
       WILLOW_LOCATION_TAIL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SAWDUST_button_back) },
     { { "TOP" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SAWDUST_button_top) },
     { { "BOTTOM" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SAWDUST_button_bottom) },
     { { "SCROLL UP" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SAWDUST_button_scroll_up) },
     { { "SCROLL DOWN" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SAWDUST_button_scroll_down) },
     { { "UP" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SAWDUST_button_up) },
     { { "DOWN" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SAWDUST_button_down) },
     { { "SCROLL" },
       WILLOW_LOCATION_R,
       WILLOW_BUTTON_REGION,
       WILLOW_ACTION_USER(SAWDUST_button_scroll) },
     { { "Move", "pecan.icons", '1', 0, 0, 1 },
       WILLOW_LOCATION_UL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_MOVE(DEFAULT) },
     { { "Size", "pecan.icons", '0', 0, 0, 1 },
       WILLOW_LOCATION_UR,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_TYPE_AUX9, NULL, WILLOW_ACTION_DEFAULT },
     { { "Remove" },
       WILLOW_LOCATION_TITLE,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_ICON(DEFAULT) },
     { { "Push" },
       WILLOW_LOCATION_TITLE,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_PUSH(DEFAULT) },
     { { "Pop" },
       WILLOW_LOCATION_TITLE,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_POP(DEFAULT) }
   }
};



/************************************************************************/
/*                                                                      */
/*      SAWDUSTinit -- initialize sawdust display                       */
/*                                                                      */
/************************************************************************/

SAWDUSTinit()
{
   TRACE("SAWDUSTinit");

   SAWDUST_window = NULL;
   SAWDUST_vtid = NULL;
   SAWDUST_current_node = NULL;

   eolfg = TRUE;
   num_lines = 0;
   num_cols = 0;

   WILLOWdefine_window(&sawdust_window);

   SDE_event_current = PLUMinq_event_id("SDE_$CURRENT");
   PLUMaccept_event(sawdust_sde_current,"SDE_$CURRENT");
};

/************************************************************************/
/*                                                                      */
/*      SAWDUSTtrace -- set trace flag                                  */
/*                                                                      */
/************************************************************************/

SAWDUSTtrace(lvl)
   Integer lvl;
{
   TRACE("SAWDUSTtrace");

   SAWDUST_tracelvl = lvl;
};

/************************************************************************/
/*                                                                      */
/*      SAWDUST_display_node -- display current node                    */
/*                                                                      */
/************************************************************************/

int
SAWDUST_display_node()
{
   register String header;

   ITRACE("SAWDUST_display_node");

   VT$PUSH(SAWDUST_vtid);
   VT$MOVE(0,0);
   VT$ERASE_SCREEN;
   VT$POP;

   SAWDUST_scroll(0,0,0);

   if (SAWDUST_current_node != NULL) {
      if ((SAWDUST_parent_node = ASPENinq_parent(SAWDUST_current_node))
             == NULL) {
         header = "'%s' - PARENT DOES NOT EXIST - children: %d";
         sawdust_record(header,ASPENrule_name
                           (ASPENinq_rule(SAWDUST_current_node)),
                        ASPENinq_arity(SAWDUST_current_node));
       }
      else {
         header = "'%s' - Child number: %d of %d - children: %d";
         sawdust_record(header,
                        ASPENrule_name
                           (ASPENinq_rule(SAWDUST_current_node)),
                        ASPENinq_son_number(SAWDUST_current_node) + 1,
                        ASPENinq_arity(SAWDUST_parent_node),
                        ASPENinq_arity(SAWDUST_current_node));
       };
    };

   sawdust_record
      (" ");

   if (SAWDUST_current_node == NULL) {
      sawdust_record("NULL");
    }
   else {
      sawdust_dump();
    };

   sawdust_record
      (" ");

   sawdust_define_scroll();
};

/************************************************************************/
/*                                                                      */
/*      SAWDUST_scroll -- scroll VT window                              */
/*                                                                      */
/************************************************************************/

SAWDUST_scroll(dl,dc,abs)
   Integer dl,dc;
   Integer abs;
{
   Integer rl,rc;
   Integer cl,cc;
   register Integer b;

   ITRACE("SAWDUST_scroll %d %d %d",dl,dc,abs);

   VT$PUSH(SAWDUST_vtid);
   VT$NO_SCROLL;
   VT$INQ_REGION(&rl,&rc);

   if (dl != 0) {
      rl += dl*(num_lines/4);
    }
   else if (dc != 0) {
      rc += dc*(num_cols/4);
    }
   else {
      VT$INQ_CURRENT(&cl,&cc);
      ITRACE("\tscroll absolute %d %d %d %d %d",rl,rc,cl,cc,abs);
      b = MAX(cl,rl+num_lines);
      b = abs*b/100-num_lines/2;
      if (b < 0) b = 0;
      else if (b > cl) b = cl;
      rl = b;
      rc = 0;
    };

   ITRACE("\tscroll to %d %d",rl,rc);

   VT$REGION(rl,rc);
   VT$POP;

   sawdust_define_scroll();
};

/************************************************************************/
/*                                                                      */
/*      SAWDUST_trace -- output trace information                       */
/*                                                                      */
/************************************************************************/

SAWDUST_trace(msg,a1,a2,a3,a4,a5,a6,a7,a8,a9)
   String msg;
   Integer a1,a2,a3,a4,a5,a6,a7,a8,a9;
{
   Character mbf[1024];

   sprintf(mbf,msg,a1,a2,a3,a4,a5,a6,a7,a8,a9);
   printf("SAWDUST: %s\n",mbf);
};

/************************************************************************/
/*                                                                      */
/*      sawdust_record -- put message in transcript for Semantic        */
/*                        Actions Window                                */
/*                                                                      */
/************************************************************************/

static
sawdust_record(msg,a1,a2,a3,a4)
   String msg;
   Universal a1,a2,a3,a4;
{
   Character buf[256],buf1[256];

   DTRACE("sawdust_record %s",msg);

   VT$PUSH(SAWDUST_vtid);
   VT$NO_SCROLL;
   VT$FONT(sawdust_font);

   if (!eolfg) VT$OUT("\n");

   sprintf(buf,msg,a1,a2,a3,a4);
   sprintf(buf1,"%s\n",buf);
   VT$OUT(buf1);

#ifdef VAX
   printf("%s",buf1);
#endif

   VT$POP;

   eolfg = TRUE;
};

/************************************************************************/
/*                                                                      */
/*      new_sawdust_window -- set up sawdust display window             */
/*                                                                      */
/************************************************************************/

static
new_sawdust_window()
{
   register ASH_WINDOW w;

   DTRACE("new_sawdust_window");

   w = ASHinq_window();

   SAWDUST_window = w;

   ASHset_control(sawdust_control);

   setup_sawdust_window();
};

/************************************************************************/
/*                                                                      */
/*      sawdust_control -- control message interpreter                  */
/*                                                                      */
/************************************************************************/

static
sawdust_control(msg,w)
   String msg;
   ASH_WINDOW w;
{
   DTRACE("sawdust_control %s 0x%x",msg,w);

   if (STREQL(msg,"PDS$NEXT")) return ASH_CONTROL_OK;

   if (STREQL(msg,"ASH$RESIZE")) {
      setup_sawdust_window();
    }
   else if (STREQL(msg,"ASH$INQ_RESIZE")) {
      remove_sawdust_window();
    }
   else if (STREQL(msg,"ASH$REMOVE")) {
      remove_sawdust_window();
      SAWDUST_window = NULL;
      SAWDUST_current_node = NULL;
    };

   return ASH_CONTROL_REJECT;
};

/************************************************************************/
/*                                                                      */
/*      sawdust_sde_current -- handle sde_locate event                  */
/*                                                                      */
/************************************************************************/

static
sawdust_sde_current(evt,act,node,name,id)
   String evt;
   PLUM_EVENT_ACTION act;
   ASPEN_NODE node;
   String name;
   Integer id;
{
   DTRACE("sawdust_sde_locate %d 0x%x %s",act,node,name);

   if (act != PLUM_EVENT_DO) return;

   if ((SAWDUST_window != NULL) && (name != "SAWDUST") &&
          (SAWDUST_current_node != node)) {
      SAWDUST_current_node = node;
      set_window_name();
      SAWDUST_display_node();
    };
};

/************************************************************************/
/*                                                                      */
/*      setup_sawdust_window -- set up sawdust display window           */
/*                                                                      */
/************************************************************************/

static
setup_sawdust_window()
{
   DTRACE("setup_sawdust_window");

   ASHpush_window();
   ASHselect(SAWDUST_window);

   SAWDUST_vtid = VTopen();
   VT$PUSH(SAWDUST_vtid);
   VT$SCROLL;
   sawdust_font = VT$LOADFONT(SAWDUST_FONT);
   VT$FONT(sawdust_font);
   VT$INQ_SIZE(&num_lines,&num_cols);
   VT$POP;

   ASHpop_window();

   set_window_name();

   SAWDUST_display_node();
};

/************************************************************************/
/*                                                                      */
/*      remove_sawdust_window -- remove sawdust display window          */
/*                                                                      */
/************************************************************************/

static
remove_sawdust_window()
{
   DTRACE("remove_sawdust_window");

   VTclose(SAWDUST_vtid);
};

/************************************************************************/
/*                                                                      */
/*      set_window_name -- put proper name in title for Semantic        */
/*                         Actions Window                               */
/*                                                                      */
/************************************************************************/

static
set_window_name()
{
   Character buf[256];

   DTRACE("set_window_name");

   if (SAWDUST_window != NULL) {
      ASHpush_window();
      ASHselect(SAWDUST_window);
      if (SAWDUST_current_node == NULL) {
         strcpy(buf,"Semantic actions");
       }
      else {
         sprintf(buf,"Semantic actions for %s",
                    ASPENinq_name(SAWDUST_current_node));
       };
      ASHset_window_name(buf);
      ASHpop_window();
    };
};

/************************************************************************/
/*                                                                      */
/*      sawdust_define_scroll -- define scroll region                   */
/*                                                                      */
/************************************************************************/

static
sawdust_define_scroll()
{
   Integer rl,rc;
   Integer cl,cc;
   register Integer b;

   DTRACE("sawdust_define_scroll");

   VT$PUSH(SAWDUST_vtid);
   VT$INQ_CURRENT(&cl,&cc);
   VT$INQ_REGION(&rl,&rc);
   VT$POP;

   b = MAX(cl,rl+num_lines);

   WILLOWbutton_feedback(SAWDUST_window,"SCROLL",TRUE,
                            WILLOW_SCROLL_REGION(rl*100/b,
                                                 (rl+num_lines)*100/b));
};

/************************************************************************/
/*                                                                      */
/*      sawdust_dump -- dump semantic statements between head and tail  */
/*                      of parent node                                  */
/*                                                                      */
/************************************************************************/

static
sawdust_dump()
{
   SAWDUST_SEMCOM_STMT current_head,current_tail,parent_head,parent_tail;
   register SAWDUST_SEMCOM_STMT s,ls;
   String indent;

   DTRACE("sawdust_dump");

   ASPENinq_semantics(SAWDUST_current_node,
                         &current_head,&current_tail);

   if (SAWDUST_parent_node != NULL) {
      ASPENinq_semantics(SAWDUST_parent_node,
                            &parent_head,&parent_tail);
    }
   else
    {
      parent_head = current_head;
      parent_tail = current_tail;
    }

   indent = "";

   if (parent_head != NULL) ls = parent_head->SEMCOM_last;

   for (s = parent_head; s != NULL;
        s = (s == parent_tail ? NULL : s->SEMCOM_next)) {
      if (s == current_head) {
         sawdust_record
            (" ");
         indent = "-> ";
       };

      sawdust_record("%s(0x%x) :\t%s %d\t [0x%x]\t@0x%x",
                        indent,s,SEMDATATABLE[s->SEMCOM_type].SEM_stmt_name,
                        s->SEMCOM_index,s->SEMCOM_value,s->SEMCOM_node);

      if (s->SEMCOM_last != ls) sawdust_record("\t ***BAD LAST 0x%x",
                                                  s->SEMCOM_last);
      ls = s;

      if (s == current_tail) {
         sawdust_record
            (" ");
         indent = "";
       };
    };
};

/* end of sawdustmain.c */
A.6. Program Listing: sawdustbutton.c 

/************************************************************************/
/*                                                                      */
/*      sawdustbutton.c                                                 */
/*                                                                      */
/*      Button handling routines for the Semantic Actions Window        */
/*                                                                      */
/************************************************************************/

#include "sawdust_local.h"

/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_top -- handle TOP button                         */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_top(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SAWDUST_button_top %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SAWDUST_scroll(0,0,0);

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_bottom -- handle BOTTOM button                   */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_bottom(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SAWDUST_button_bottom %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SAWDUST_scroll(0,0,100);

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_in -- handle IN button                           */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_in(dir)
   WILLOW_ACTION_MODE dir;
{
   register ASPEN_NODE s;

   ITRACE("SAWDUST_button_in %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   if ((SAWDUST_current_node != NULL) &&
          ((s = ASPENinq_son(SAWDUST_current_node,0)) != NULL)) {
      SAWDUST_current_node = s;
      SDE_LOCATE(SAWDUST_current_node,"SAWDUST",0);
      SAWDUST_display_node();
    };

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_out -- handle OUT button                         */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_out(dir)
   WILLOW_ACTION_MODE dir;
{
   register ASPEN_NODE p;

   ITRACE("SAWDUST_button_out %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   if ((SAWDUST_current_node != NULL) &&
          ((p = ASPENinq_parent(SAWDUST_current_node)) != NULL)) {
      SAWDUST_current_node = p;
      SDE_LOCATE(SAWDUST_current_node,"SAWDUST",0);
      SAWDUST_display_node();
    };

   return TRUE;
};
/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_next -- handle NEXT button                       */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_next(dir)
   WILLOW_ACTION_MODE dir;
{
   register Integer s;
   register ASPEN_NODE p;

   ITRACE("SAWDUST_button_next %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   if ((SAWDUST__current_node != NULL) &&
       ((p = ASPENinq_parent(SAWDUST__current_node)) != NULL) &&
       ((s = ASPENinq_son_number(SAWDUST__current_node)) <
           ASPENinq_arity(p)-1)) {
      SAWDUST__current_node = ASPENinq_son(p,s+1);
      SDE_LOCATE(SAWDUST__current_node,"SAWDUST",0);
      SAWDUST_display_node();
    };

   return TRUE;
}


/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_back -- handle BACK button                       */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_back(dir)
   WILLOW_ACTION_MODE dir;
{
   register ASPEN_NODE p;
   register Integer s;

   ITRACE("SAWDUST_button_back %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   if ((SAWDUST__current_node != NULL) &&
       ((s = ASPENinq_son_number(SAWDUST__current_node)) > 0) &&
       ((p = ASPENinq_parent(SAWDUST__current_node)) != NULL)) {
      SAWDUST__current_node = ASPENinq_son(p,s-1);
      SDE_LOCATE(SAWDUST__current_node,"SAWDUST",0);
      SAWDUST_display_node();
    };

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_scroll_up -- handle SCROLL UP button             */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_scroll_up(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SAWDUST_button_scroll_up %d",dir);

   if (dir != WILLOW_ACTION_DO) return;
   SAWDUST_scroll(-4,0,0);
   return TRUE;
};


/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_scroll_down -- handle SCROLL DOWN button         */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_scroll_down(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SAWDUST_button_scroll_down %d",dir);

   if (dir != WILLOW_ACTION_DO) return;
   SAWDUST_scroll(4,0,0);
   return TRUE;
};


/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_up -- handle UP button                           */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_up(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SAWDUST_button_up %d",dir);

   if (dir != WILLOW_ACTION_DO) return;
   SAWDUST_scroll(-1,0,0);
   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_down -- handle DOWN button                       */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_down(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SAWDUST_button_down %d",dir);

   if (dir != WILLOW_ACTION_DO) return;
   SAWDUST_scroll(1,0,0);
   return TRUE;
};


/************************************************************************/
/*                                                                      */
/*      SAWDUST_button_scroll -- handle scroll bar                      */
/*                                                                      */
/************************************************************************/

int
SAWDUST_button_scroll(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SAWDUST_button_scroll %d",dir);

   if (dir != WILLOW_ACTION_DO) return;
   SAWDUST_scroll(0,0,WILLOWinq_scroll());
   return TRUE;
};

/* end of sawdustbutton.c */
Appendix B
The SEMCOM Module 

As explained in Chapter 5, the SEMCOM module handles incremental compilation 
in PECAN. This appendix contains a description of, and selected program listings 
from, the files that make up that module as modified in the manner described in 
Chapter 6. 

B.1. The Compilation Monitor

The SEMCOM module has been modified so as to provide a new view: a window
which displays compilation information. Buttons are provided which allow the 
window to be scrolled so that all of the information may be examined. Other 
buttons allow the programmer to choose the method of compilation to be employed 
when a modification is made to the AST. 

An example PECAN screen, showing the compilation monitor, is reproduced in
Figure B-1. The SDE and the flow view are on the left side of the screen; the
compilation monitor is on the right. The scroll buttons are identical to those
provided by the semantic actions view, and explained in §A.1. In addition, the
CLEAR button clears the screen, erasing any information which may have been
displayed on it.

The INCREMENTAL, PROCEDURE and COMPLETE buttons choose the
compilation method that will be used next. Each choice is echoed on the screen
when made. The COMPILE button forces SEMCOM to compile immediately
(unless the compilation method is incremental). The AUTO button toggles
automatic recompilation. When automatic recompilation is set, compilation is
triggered by every change made to the AST. When automatic recompilation is not
set,¹ compilation is not performed unless the COMPILE button is used or (if
procedure compilation is selected) a change is made to the AST outside the
procedure within which the last change was made.



¹ The AUTO button becomes the MANUAL button when automatic recompilation is not set, in order to
display the state of automatic recompilation.
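
The rules governing when a compilation is started can be summarised as a single predicate. The following is only a sketch, written in modern C rather than the K&R style of the listings; the names method_t, event_t and compilation_triggered are hypothetical and do not appear in SEMCOM. Note that when procedure compilation is selected and automatic recompilation is off, the compilation triggered by leaving a procedure is that of the procedure which was left.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
typedef enum { METHOD_INCREMENTAL, METHOD_PROCEDURE, METHOD_COMPLETE } method_t;
typedef enum { EVENT_AST_CHANGE, EVENT_COMPILE_BUTTON } event_t;

/*
 * Returns true when the event should start a compilation.
 * left_procedure is only meaningful for EVENT_AST_CHANGE: it is true when
 * the change lies outside the procedure that contained the previous change.
 */
bool compilation_triggered(method_t method, bool auto_recomp,
                           event_t event, bool left_procedure)
{
    /* COMPILE forces a compilation unless the method is incremental. */
    if (event == EVENT_COMPILE_BUTTON)
        return method != METHOD_INCREMENTAL;

    /* With automatic recompilation set, every AST change compiles. */
    if (auto_recomp)
        return true;

    /* Otherwise only a move out of the current procedure compiles,
       and only when procedure compilation is selected. */
    return method == METHOD_PROCEDURE && left_procedure;
}
```

For example, with COMPLETE selected and automatic recompilation off, an AST change returns false and only the COMPILE button returns true.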



Figure B-1: The Compilation Monitor



B.2. Implementation Details 

The modified SEMCOM module consists of eight files: 

semcom.h (§B.3)

The external header file.
Lists the externally accessible SEMCOM functions and gives
details of the module's trace facilities.

semcom_local.hi (§B.4)

The local header file.
Includes the definition of the SEMCOM_STMT type.

semcommain.c (Not listed - modifications to SEMCOM did not significantly
affect this file.)

Includes the initialization and trace routines, and sem_event_node
which is invoked by PLUM when an ASPEN_$NODE_CHANGE
event is broadcast.

semcomstmt.c (§B.5)

Maintains lists of SEMCOM_STMTs. Includes the
_SEMCOM_replace_list and _SEMCOM_remove_list functions
(modified to handle procedure compilation and complete
compilation) and the new functions SEMCOM_force_compilation
(which implements the COMPILE button), copy_list (which makes
a copy of an existing list of SEMCOM_STMTs), and
enclosing_block and enclosing_program (which find the enclosing
node of the appropriate type in the AST).

semcomeval.c (§B.6)

Contains the head_merge, tail_merge, extend and insert functions.
The remove function is renamed to SEMCOM_remove (because
the modifications required that it be visible to other files in the
SEMCOM module). These low-level functions are called by the
new functions SEMCOM_change_incremental,
SEMCOM_change_procedure and SEMCOM_change_complete
which replace the function _SEMCOM_change.

semcomexec.c (Not listed - modifications to SEMCOM did not affect this file.)

Handles the execution and unexecution of SEMCOM_STMTs.
_SEMCOM_execute and _SEMCOM_unexecute both use a large
switch statement with a case for each type of SEMCOM_STMT.
Maintains and modifies the values of the current items (using
_SEMCOM_set_current, _SEMCOM_get_currents, etc.).

semcomwindow.c (§B.7 - New file)

Defines the SEMCOM compilation monitor (using the WILLOW
module) and controls the display of information on that screen.

semcombutton.c (§B.8 - New file)

Button handling routines.



Only those functions that were added or altered when the SEMCOM module was 
modified have been included in the program listings that follow. 
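
The list maintenance performed by semcomstmt.c amounts to splicing sub-lists in and out of a doubly linked list of statements. The following is a minimal sketch of that splice in modern C, assuming a simplified statement record; the type stmt and the function splice_replace are hypothetical stand-ins for SEMCOM_STMT and the inline code in _SEMCOM_replace_list, not part of the module itself.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a SEMCOM_STMT: only the links and a type tag. */
typedef struct stmt {
    struct stmt *next;   /* next statement (cf. SEMCOM_next)     */
    struct stmt *last;   /* previous statement (cf. SEMCOM_last) */
    int          type;
} stmt;

/*
 * Link the new sub-list new_hd..new_tl into the position occupied by the
 * old sub-list old_hd..old_tl. The old nodes keep their own links, so the
 * caller can still walk them afterwards (for example to unexecute them).
 */
void splice_replace(stmt *old_hd, stmt *old_tl, stmt *new_hd, stmt *new_tl)
{
    if (new_hd != NULL && old_hd != NULL) {
        new_hd->last = old_hd->last;
        if (old_hd->last != NULL)
            old_hd->last->next = new_hd;
    }
    if (new_tl != NULL && old_tl != NULL) {
        new_tl->next = old_tl->next;
        if (old_tl->next != NULL)
            old_tl->next->last = new_tl;
    }
}
```

Given a list a, b, c, d and a new sub-list x, y, calling splice_replace(&b, &c, &x, &y) leaves the list reading a, x, y, d, while b and c still point at their old neighbours.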
B.3. Program Listing: semcom.h

/************************************************************************/
/*                                                                      */
/*      semcom.h                                                        */
/*                                                                      */
/*      External definitions for using incremental symbol compiler      */
/*                                                                      */
/************************************************************************/
/*      Copyright 1984 Brown University -- Steven P. Reiss              */
/*      Modified James Popple   September/October 1987                  */


/************************************************************************/
/*                                                                      */
/*      Tracing definitions -- use "-Dsn" switch, where n gives         */
/*              the type(s) of trace.                                   */
/*                                                                      */
/************************************************************************/

#define SEMCOM_TRACE_OFF        0
#define SEMCOM_TRACE_ON         1
#define SEMCOM_TRACE_INT        2
#define SEMCOM_TRACE_DEBUG      4
#define SEMCOM_TRACE_DUMP       8
#define SEMCOM_TRACE_COMPILE    16


/************************************************************************/
/*                                                                      */
/*      General routines                                                */
/*                                                                      */
/************************************************************************/

extern                  SEMCOMinit();
extern                  SEMCOMtrace();
extern                  SEMCOMupdate();
extern  SYM_REF         SEMCOMinq_ref();
extern  EXPR_NODE       SEMCOMinq_expr();
extern  TYPE_DEF        SEMCOMinq_type();
extern  FLOW_NODE       SEMCOMinq_flow();

extern  Integer         SEMCOMsuggest_text();
extern  Boolean         SEMCOMtest_begin();

/* end of semcom.h */
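
Because the trace constants above are distinct powers of two, several kinds of trace can be requested at once by adding their values, and each kind is tested with a bitwise AND (as the TRACE macros in §B.4 do). The sketch below illustrates this in modern C; the TRACE_* names and the function trace_enabled are hypothetical and merely mirror the SEMCOM_TRACE_* values.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror the SEMCOM_TRACE_* constants; the names are illustrative. */
#define TRACE_ON       1
#define TRACE_INT      2
#define TRACE_DEBUG    4
#define TRACE_DUMP     8
#define TRACE_COMPILE 16

/* True when the given kind of trace is switched on in the trace level. */
bool trace_enabled(int level, int kind)
{
    return (level & kind) != 0;
}
```

A trace level of 18 (2 + 16), for instance, enables the internal and compilation traces and nothing else.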
B.4. Program Listing: semcom_local.hi

/************************************************************************/
/*                                                                      */
/*      semcom_local.h (derived from semcom_local.hi)                   */
/*                                                                      */
/*      Local definitions for incremental symbol compiler               */
/*                                                                      */
/************************************************************************/

#include <plum.h>
#include <aspen.h>
#include <symbols.h>
#include <vt.h>
#include <willow.h>
#include <type.h>
#include <expr.h>
#include <flow.h>
#include "semcom.h"
#include <ash.h>


/************************************************************************/
/*                                                                      */
/*      Data structures                                                 */
/*                                                                      */
/************************************************************************/

#ifndef SEMCOM_MAIN
#define PLUM_INCLUDE_ONLY
#endif

Mode SemcomDefs Is

   Type ASPEN_NODE From Mode Aspen External;
   Type SYM_SCOPE From Mode Symbols External;
   Type SYM_NAME From Mode Symbols External;
   Type SYM_OBJECT From Mode Symbols External;
   Type SYM_OBJECTSET From Mode Symbols External;
   Type SYM_REF From Mode Symbols External;
   Type FLOW_NODE From Mode Flows External;
   Type TYPE_DEF From Mode Types External;
   Type EXPR_NODE From Mode Exprs External;

   Type SEMCOM_COMPILATION_TYPE Is Enum
      SEMCOM_COMP_INCREMENTAL,
      SEMCOM_COMP_PROCEDURE,
      SEMCOM_COMP_COMPLETE;

   SEMCOM_NAME_INFO =>                     /* info on name basis   */
      SEMCOM_name       : String,          /* the name itself      */
      SEMCOM_first      : SEMCOM_STMT,     /* first symbol stmt    */
      SEMCOM_active     : Boolean;         /* name is active       */

   SEMCOM_STMT =>                          /* statement descriptor */
      SEMCOM_next       : SEMCOM_STMT,     /* next statement       */
      SEMCOM_last       : SEMCOM_STMT,     /* previous statement   */
      SEMCOM_type       : Short,           /* statement type       */
      SEMCOM_index      : Short,           /* index for args       */
      SEMCOM_node       : ASPEN_NODE,      /* tree node for args   */
      SEMCOM_value      : Univ_Ptr;        /* value                */
   SEMCOM_CUR =>                           /* current values       */
      SEMCOM_cur_scope      : SYM_SCOPE,
      SEMCOM_cur_ref        : SYM_REF,
      SEMCOM_cur_flow       : FLOW_NODE,
      SEMCOM_cur_type       : TYPE_DEF,
      SEMCOM_cur_expr       : EXPR_NODE,
      SEMCOM_cur_use_scope  : SYM_SCOPE,
      SEMCOM_cur_build_type : TYPE_DEF,
      SEMCOM_cur_mode       : Integer;

End

#ifdef PLUM_INCLUDE_ONLY
#undef PLUM_INCLUDE_ONLY
#endif


/************************************************************************/
/*                                                                      */
/*      Constants                                                       */
/*                                                                      */
/************************************************************************/

#define SEMCOM_FONT     WILLOWfontname("PALM_FONT")


/************************************************************************/
/*                                                                      */
/*      Tracing definitions                                             */
/*                                                                      */
/************************************************************************/

extern  Integer         _SEMCOM__trace_level;
extern  Boolean         _SEMCOM__initfg;

#define TRACE   if (_SEMCOM__trace_level & SEMCOM_TRACE_ON)\
                        _SEMCOM_trace
#define ITRACE  if (_SEMCOM__trace_level & SEMCOM_TRACE_INT)\
                        _SEMCOM_trace
#define DTRACE  if (_SEMCOM__trace_level & SEMCOM_TRACE_DEBUG)\
                        _SEMCOM_trace
#define CTRACE  if (_SEMCOM__trace_level & SEMCOM_TRACE_COMPILE)\
                        _SEMCOM_trace

#define ERROR(msg)      _SEMCOM_trace("Error: msg")
#define ABORT(msg)      (_SEMCOM_trace("ABORT: msg"),abort())

#define CHECKINIT       if (!_SEMCOM__initfg) SEMCOMinit()
#define ENTER           CHECKINIT; TRACE

/************************************************************************/
/*                                                                      */
/*      Miscellaneous definitions                                       */
/*                                                                      */
/************************************************************************/

#define CUR_NAME        (_SEMCOM__cur->SEMCOM_name)
#define FIRSTSTMT       (_SEMCOM__cur->SEMCOM_first)
#define ACTIVE          (_SEMCOM__cur->SEMCOM_active)


/************************************************************************/
/*                                                                      */
/*      Variable definitions                                            */
/*                                                                      */
/************************************************************************/

extern  ASH_WINDOW                      SEMCOM__window;
extern  Boolean                         SEMCOM__auto_recomp;
extern  SEMCOM_COMPILATION_TYPE         SEMCOM__comp_type;


/************************************************************************/
/*                                                                      */
/*      Local definitions from semcommain.c                             */
/*                                                                      */
/************************************************************************/

extern  SEMCOM_NAME_INFO        _SEMCOM__cur;
extern                  _SEMCOM_trace();
extern                  _SEMCOM_dump();
extern                  _SEMCOM_set_current_name();
extern                  _SEMCOM_reset_current_name();
extern                  _SEMCOM_set_current_node();


/************************************************************************/
/*                                                                      */
/*      Local definitions from semcomstmt.c                             */
/*                                                                      */
/************************************************************************/

extern                  _SEMCOM_stmt_init();
extern  Boolean         SEMCOM_test_for();
extern  Boolean         SEMCOM_test_del_ok();
extern                  _SEMCOM_replace_list();
extern                  _SEMCOM_remove_list();
extern                  SEMCOM_force_compilation();
extern                  _SEMCOM_stmt_free();
extern  SEMCOM_STMT     _SEMCOM_findprevious();

/************************************************************************/
/*                                                                      */
/*      Local definitions from semcomeval.c                             */
/*                                                                      */
/************************************************************************/

extern                  _SEMCOM_eval_init();
extern                  SEMCOM_change_incremental();
extern                  SEMCOM_change_procedure();
extern                  SEMCOM_change_complete();
extern                  SEMCOM_remove();


/************************************************************************/
/*                                                                      */
/*      Local definitions from semcomexec.c                             */
/*                                                                      */
/************************************************************************/

extern                  _SEMCOM_exec_init();
extern                  _SEMCOM_set_current();
extern                  _SEMCOM_unexecute();
extern                  _SEMCOM_execute();
extern                  _SEMCOM_get_currents();
extern                  SEMCOM_free_value();


/************************************************************************/
/*                                                                      */
/*      Local definitions from semcomwindow.c                           */
/*                                                                      */
/************************************************************************/

extern                  SEMCOM_window_init();
extern                  SEMCOM_scroll();
extern                  SEMCOM_clear_screen();
extern                  SEMCOM_record();


/************************************************************************/
/*                                                                      */
/*      Local definitions from semcombutton.c                           */
/*                                                                      */
/************************************************************************/

extern                  SEMCOM_button_compile();
extern                  SEMCOM_button_incremental();
extern                  SEMCOM_button_procedure();
extern                  SEMCOM_button_complete();
extern                  SEMCOM_button_auto();
extern                  SEMCOM_button_top();
extern                  SEMCOM_button_bottom();
extern                  SEMCOM_button_scroll_up();
extern                  SEMCOM_button_scroll_down();
extern                  SEMCOM_button_up();
extern                  SEMCOM_button_down();
extern                  SEMCOM_button_clear();
extern                  SEMCOM_button_scroll();

/* end of semcom_local.h */
B.5. Abridged Program Listing: semcomstmt.c

/************************************************************************/
/*                                                                      */
/*      semcomstmt.c                                                    */
/*                                                                      */
/*      Routines for statement list maintenance in symbol compiler      */
/*                                                                      */
/************************************************************************/

#include "semcom_local.h"
#include <sem_reader.h>


/************************************************************************/
/*                                                                      */
/*      Local storage                                                   */
/*                                                                      */
/************************************************************************/

        SEMCOM_NAME_INFO        _SEMCOM__cur;

static  SEMCOM_STMT     freelist = NULL;
static  SEMCOM_STMT     firststmt = NULL;

static  SEMCOM_STMT     proc_old_hd = NULL;
static  SEMCOM_STMT     proc_old_tl = NULL;
static  SEMCOM_STMT     proc_new_hd = NULL;
static  SEMCOM_STMT     proc_new_tl = NULL;

static  SEMCOM_STMT     proc_after = NULL;

static  ASPEN_NODE      current_node = NULL;
static  ASPEN_NODE      previous_node = NULL;
static  ASPEN_NODE      current_procedure = NULL;


/************************************************************************/
/*                                                                      */
/*      Miscellaneous definitions                                       */
/*                                                                      */
/************************************************************************/

#define CUR_MAX_STMT    2
#define BLOCK           19


/************************************************************************/
/*                                                                      */
/*      Forward Definitions                                             */
/*                                                                      */
/************************************************************************/

static                  semcom_new_node();
static  SEMCOM_STMT     stmt_list();
static  SEMCOM_STMT     eval_do_stmt();
static  SEMCOM_STMT     new_stmt();
static                  copy_list();
static  ASPEN_NODE      enclosing_block();
static  ASPEN_NODE      enclosing_program();
static  int             varcmp();

/************************************************************************/
/*                                                                      */
/*      _SEMCOM_stmt_init -- initialize module                          */
/*                                                                      */
/************************************************************************/

_SEMCOM_stmt_init()
{
   ITRACE("_SEMCOM_stmt_init");

   freelist = NULL;
   firststmt = NULL;

   PLUMaccept_event(semcom_new_node,"SDE_$CURRENT");
};


Not listed - SEMCOMsuggest_text
Not listed - SEMCOMtest_begin
Not listed - SEMCOM_test_for
Not listed - SEMCOM_test_del_ok



/************************************************************************/
/*                                                                      */
/*      _SEMCOM_replace_list -- replace the statement list for a node   */
/*                                                                      */
/************************************************************************/

_SEMCOM_replace_list(n)
   ASPEN_NODE n;
{
   register SEMCOM_STMT s;
   SEMCOM_STMT b,olds,oldtl,hd,tl;

   ITRACE("_SEMCOM_replace_list 0x%x",n);

   switch (SEMCOM__comp_type) {
      case SEMCOM_COMP_INCREMENTAL:
         b = _SEMCOM_findprevious(n);
         ASPENinq_semantics(n,&olds,&oldtl);
         ASPENset_semantics(n,NULL,NULL);
         if (ACTIVE) stmt_list(n,NULL);
         ASPENinq_semantics(n,&hd,&tl);
         SEMCOM_change_incremental(b,olds,oldtl,hd,tl);
         break;

      case SEMCOM_COMP_PROCEDURE:
         if (enclosing_block(n,FALSE) == current_procedure) {
            ASPENinq_semantics(n,&olds,&oldtl);
            ASPENset_semantics(n,NULL,NULL);
            if (ACTIVE) stmt_list(n,NULL);
            ASPENinq_semantics(n,&hd,&tl);
            if ((hd != NULL) && (olds != NULL)) {
               hd->SEMCOM_last = olds->SEMCOM_last;
               if (olds->SEMCOM_last != NULL)
                  olds->SEMCOM_last->SEMCOM_next = hd;
             };
            if ((tl != NULL) && (oldtl != NULL)) {
               tl->SEMCOM_next = oldtl->SEMCOM_next;
               if (oldtl->SEMCOM_next != NULL)
                  oldtl->SEMCOM_next->SEMCOM_last = tl;
             }
          }
         else {
            if ((current_procedure != NULL) &&
                (proc_new_hd != NULL) &&
                (proc_new_tl != NULL) &&
                (proc_old_hd != NULL) &&
                (proc_old_tl != NULL)) {
               SEMCOM_record("Moved out of previous procedure,");
               SEMCOM_record
                  ("     forcing compilation of previous procedure");
               SEMCOM_force_compilation();
             }
            current_procedure = enclosing_block(n,TRUE);
            proc_after = _SEMCOM_findprevious(current_procedure);
            ASPENinq_semantics(current_procedure,&proc_new_hd,
                                  &proc_new_tl);
            copy_list(proc_new_hd,proc_new_tl,&proc_old_hd,&proc_old_tl);
            ASPENinq_semantics(n,&olds,&oldtl);
            ASPENset_semantics(n,NULL,NULL);
            if (ACTIVE) stmt_list(n,NULL);
            ASPENinq_semantics(n,&hd,&tl);
            if ((hd != NULL) && (olds != NULL)) {
               hd->SEMCOM_last = olds->SEMCOM_last;
               if (olds->SEMCOM_last != NULL)
                  olds->SEMCOM_last->SEMCOM_next = hd;
             }
            if ((tl != NULL) && (oldtl != NULL)) {
               tl->SEMCOM_next = oldtl->SEMCOM_next;
               if (oldtl->SEMCOM_next != NULL)
                  oldtl->SEMCOM_next->SEMCOM_last = tl;
             }
          }
         if (SEMCOM__auto_recomp) {
            SEMCOM_change_procedure(proc_after,proc_old_hd,proc_old_tl,
                                       proc_new_hd,proc_new_tl);
            proc_old_hd = NULL;
            proc_old_tl = NULL;
            proc_new_hd = NULL;
            proc_new_tl = NULL;
            current_procedure = NULL;
            proc_after = NULL;
          }
         else
            SEMCOM_change_procedure(NULL,NULL,NULL,NULL,NULL);
         break;

      case SEMCOM_COMP_COMPLETE:
         ASPENinq_semantics(n,&olds,&oldtl);
         ASPENset_semantics(n,NULL,NULL);
         if (ACTIVE) stmt_list(n,NULL);
         ASPENinq_semantics(n,&hd,&tl);
         if ((hd != NULL) && (olds != NULL)) {
            hd->SEMCOM_last = olds->SEMCOM_last;
            if (olds->SEMCOM_last != NULL)
               olds->SEMCOM_last->SEMCOM_next = hd;
          };
         if ((tl != NULL) && (oldtl != NULL)) {
            tl->SEMCOM_next = oldtl->SEMCOM_next;
            if (oldtl->SEMCOM_next != NULL)
               oldtl->SEMCOM_next->SEMCOM_last = tl;
          };
         SEMCOM_remove(olds,oldtl);
         if (SEMCOM__auto_recomp) {
            n = enclosing_program(n,TRUE);
            ASPENinq_semantics(n,&hd,&tl);
            SEMCOM_change_complete(hd,tl);
          }
         else
            SEMCOM_change_complete(NULL,NULL);
         break;
    }

   if (_SEMCOM__trace_level & SEMCOM_TRACE_DUMP) _SEMCOM_dump();
};

/************************************************************************/
/*                                                                      */
/*      _SEMCOM_remove_list -- remove the statement list for a node     */
/*                                                                      */
/************************************************************************/

_SEMCOM_remove_list(n)
   ASPEN_NODE n;
{
   register SEMCOM_STMT s;
   SEMCOM_STMT b,olds,oldtl,hd,tl;

   ITRACE("_SEMCOM_remove_list 0x%x",n);

   switch (SEMCOM__comp_type) {
      case SEMCOM_COMP_INCREMENTAL:
         b = _SEMCOM_findprevious(n);
         ASPENinq_semantics(n,&olds,&oldtl);
         ASPENset_semantics(n,NULL,NULL);
         SEMCOM_change_incremental(b,olds,oldtl,NULL,NULL);
         break;

      case SEMCOM_COMP_PROCEDURE:
         if (enclosing_block(n,FALSE) == current_procedure) {
            ASPENinq_semantics(n,&olds,&oldtl);
            ASPENset_semantics(n,NULL,NULL);
            if ((olds != NULL) && (olds->SEMCOM_last != NULL) &&
                (oldtl != NULL))
               olds->SEMCOM_last->SEMCOM_next = oldtl->SEMCOM_next;
            if ((oldtl != NULL) && (oldtl->SEMCOM_next != NULL) &&
                (olds != NULL))
               oldtl->SEMCOM_next->SEMCOM_last = olds->SEMCOM_last;
            SEMCOM_remove(olds,oldtl);
          }
         else {
            if ((current_procedure != NULL) &&
                (proc_new_hd != NULL) &&
                (proc_new_tl != NULL) &&
                (proc_old_hd != NULL) &&
                (proc_old_tl != NULL)) {
               SEMCOM_record("Moved out of previous procedure,");
               SEMCOM_record
                  ("     forcing compilation of previous procedure");
               SEMCOM_force_compilation();
             }
            current_procedure = enclosing_block(n,TRUE);
            proc_after = _SEMCOM_findprevious(current_procedure);
            ASPENinq_semantics(current_procedure,&proc_new_hd,
                                  &proc_new_tl);
            copy_list(proc_new_hd,proc_new_tl,&proc_old_hd,&proc_old_tl);
            ASPENinq_semantics(n,&olds,&oldtl);
          }
         ASPENset_semantics(n,NULL,NULL);
         if ((olds != NULL) && (olds->SEMCOM_last != NULL) &&
             (oldtl != NULL))
            olds->SEMCOM_last->SEMCOM_next = oldtl->SEMCOM_next;
         if ((oldtl != NULL) && (oldtl->SEMCOM_next != NULL) &&
             (olds != NULL))
            oldtl->SEMCOM_next->SEMCOM_last = olds->SEMCOM_last;
         SEMCOM_remove(olds,oldtl);

         if (SEMCOM__auto_recomp) {
            SEMCOM_change_procedure(proc_after,proc_old_hd,proc_old_tl,
                                       proc_new_hd,proc_new_tl);
            proc_old_hd = NULL;
            proc_old_tl = NULL;
            proc_new_hd = NULL;
            proc_new_tl = NULL;
            current_procedure = NULL;
            proc_after = NULL;
          }
         else
            SEMCOM_change_procedure(NULL,NULL,NULL,NULL,NULL);
         break;

      case SEMCOM_COMP_COMPLETE:
         ASPENinq_semantics(n,&olds,&oldtl);
         ASPENset_semantics(n,NULL,NULL);
         if ((olds != NULL) && (olds->SEMCOM_last != NULL) &&
             (oldtl != NULL))
            olds->SEMCOM_last->SEMCOM_next = oldtl->SEMCOM_next;
         if ((oldtl != NULL) && (oldtl->SEMCOM_next != NULL) &&
             (olds != NULL))
            oldtl->SEMCOM_next->SEMCOM_last = olds->SEMCOM_last;
         SEMCOM_remove(olds,oldtl);
         if (SEMCOM__auto_recomp) {
            n = enclosing_program(n,TRUE);
            ASPENinq_semantics(n,&hd,&tl);
            SEMCOM_change_complete(hd,tl);
          }
         else
            SEMCOM_change_complete(NULL,NULL);
         break;
    }

   if (_SEMCOM__trace_level & SEMCOM_TRACE_DUMP) _SEMCOM_dump();
};

/************************************************************************/
/*                                                                      */
/*      SEMCOM_force_compilation                                        */
/*                                                                      */
/************************************************************************/

SEMCOM_force_compilation()
{
   ASPEN_NODE n;
   Boolean temp;
   SEMCOM_STMT olds,oldtl,b,hd,tl;

   ITRACE("SEMCOM_force_compilation");

   temp = SEMCOM__auto_recomp;
   SEMCOM__auto_recomp = TRUE;

   switch (SEMCOM__comp_type) {
      case SEMCOM_COMP_INCREMENTAL:
         SEMCOM_record("Attempt to force compilation,");
         SEMCOM_record("     but compilation is set to INCREMENTAL");
         break;

      case SEMCOM_COMP_PROCEDURE:
         SEMCOM_change_procedure(proc_after,proc_old_hd,proc_old_tl,
                                    proc_new_hd,proc_new_tl);
         proc_old_hd = NULL;
         proc_old_tl = NULL;
         proc_new_hd = NULL;
         proc_new_tl = NULL;
         current_procedure = NULL;
         proc_after = NULL;
         break;

      case SEMCOM_COMP_COMPLETE:
         n = enclosing_program(current_node,TRUE);
         ASPENinq_semantics(n,&hd,&tl);
         SEMCOM_change_complete(hd,tl);
         break;
    };

   SEMCOM__auto_recomp = temp;
};


Not listed - _SEMCOM_stmt_free
Not listed - _SEMCOM_findprevious
/************************************************************************/
/*                                                                      */
/*      semcom_new_node -- set last focus as current node has changed   */
/*                                                                      */
/************************************************************************/

static
semcom_new_node(evt,act,node,name,id)
   String evt;
   PLUM_EVENT_ACTION act;
   ASPEN_NODE node;
   String name;
   Integer id;
{
   DTRACE("semcom_new_node");

   if (act != PLUM_EVENT_DO) return;

   previous_node = current_node;
   current_node = node;
};


Not listed - stmt_list
Not listed - eval_do_stmt
Not listed - new_stmt



/************************************************************************/
/*                                                                      */
/*      copy_list -- copy an existing list of SEMCOM statements into    */
/*              a new list                                              */
/*                                                                      */
/************************************************************************/

static
copy_list(old_hd,old_tl,new_hd,new_tl)
   SEMCOM_STMT old_hd,old_tl,*new_hd,*new_tl;
{
   register SEMCOM_STMT s,ls;

   DTRACE("copy_list 0x%x 0x%x",old_hd,old_tl);

   *new_hd = NULL;
   *new_tl = NULL;
   s = NULL;
   ls = NULL;

   while (old_hd != NULL) {
      if (freelist == NULL) s = ALLOC(SEMCOM_STMT);
      else {
         s = freelist;
         freelist = s->SEMCOM_next;
       }
      if (*new_hd == NULL) *new_hd = s;
      s->SEMCOM_next = NULL;
      s->SEMCOM_last = ls;
      s->SEMCOM_type = old_hd->SEMCOM_type;
      s->SEMCOM_index = old_hd->SEMCOM_index;
      s->SEMCOM_node = old_hd->SEMCOM_node;
      s->SEMCOM_value = old_hd->SEMCOM_value;
      if (ls != NULL) ls->SEMCOM_next = s;
      ls = s;
      old_hd = (old_hd == old_tl ? NULL : old_hd->SEMCOM_next);
    }

   *new_tl = s;
};


/************************************************************************/
/*                                                                      */
/*      enclosing_block -- returns the enclosing block node             */
/*                                                                      */
/************************************************************************/

static ASPEN_NODE
enclosing_block(n,print)
   ASPEN_NODE n;
   Boolean print;
{
   ASPEN_NODE x;
   register Integer count;

   DTRACE("enclosing_block 0x%x",n);

   count = 0;
   x = n;

   while ((x != NULL) &&
          (ASPENinq_rule(x) != BLOCK)) {
      x = ASPENinq_parent(x);
      ++count;
      if (x != NULL) n = x;
    };

   if ((x != NULL) && (print))
      SEMCOM_record("Found enclosing block after %d steps",count);

   return n;
};

/************************************************************************/
/*                                                                      */
/*      enclosing_program -- returns the enclosing program node         */
/*                                                                      */
/************************************************************************/

static ASPEN_NODE
enclosing_program(n,print)
   ASPEN_NODE n;
   Boolean print;
{
   ASPEN_NODE x;
   register Integer count;

   DTRACE("enclosing_program 0x%x",n);

   count = 0;
   x = n;

   while (x != NULL) {
      x = ASPENinq_parent(x);
      ++count;
      if (x != NULL) n = x;
    };

   if (print)
      SEMCOM_record("Found enclosing program after %d steps",count);

   return n;
};


Not listed - varcmp



/* end of semcomstmt.c */ 
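
The search performed by enclosing_block can be tried in isolation from ASPEN. The sketch below, in modern C, assumes a minimal tree node with only a parent link and a rule number; the type node, the constant RULE_BLOCK and the function enclosing_block_of are hypothetical stand-ins for ASPEN_NODE, BLOCK and enclosing_block. As in the listing, the walk returns the root of the tree when no ancestor is a block.

```c
#include <assert.h>
#include <stddef.h>

#define RULE_BLOCK 19            /* mirrors "#define BLOCK 19" in the listing */

/* Minimal stand-in for an AST node. */
typedef struct node {
    struct node *parent;         /* NULL at the root */
    int          rule;           /* grammar rule number of this node */
} node;

/*
 * Walk up from n until a node with rule RULE_BLOCK is found and return it.
 * If no such node exists, the root of the tree is returned instead.
 * When steps is not NULL it receives the number of parent links followed.
 */
node *enclosing_block_of(node *n, int *steps)
{
    node *x = n;
    int count = 0;

    while (x != NULL && x->rule != RULE_BLOCK) {
        x = x->parent;
        ++count;
        if (x != NULL)
            n = x;               /* remember the last real node visited */
    }
    if (steps != NULL)
        *steps = count;
    return n;
}
```

Starting from a node that is itself a block returns that node after zero steps, so an edit made directly to a block node is attributed to that block.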
B.6. Abridged Program Listing: semcomeval.c

/************************************************************************/
/*                                                                      */
/*      semcomeval.c                                                    */
/*                                                                      */
/*      Routines for statement list evaluation in symbol compiler       */
/*                                                                      */
/************************************************************************/

#include "semcom_local.h"
#include <sem_reader.h>


/************************************************************************/
/*                                                                      */
/*      Local storage definitions                                       */
/*                                                                      */
/************************************************************************/

        Boolean                         SEMCOM__auto_recomp;
        SEMCOM_COMPILATION_TYPE         SEMCOM__comp_type;


/************************************************************************/
/*                                                                      */
/*      Miscellaneous definitions                                       */
/*                                                                      */
/************************************************************************/

typedef enum {
   EXTEND_STATE_INIT,
   EXTEND_STATE_OLD,
   EXTEND_STATE_NEW,
   EXTEND_STATE_SCAN
} EXTEND_STATE;


/************************************************************************/
/*                                                                      */
/*      Forward Definitions                                             */
/*                                                                      */
/************************************************************************/

static                  undo_execution();
static                  do_execution();
static                  head_merge();
static                  tail_merge();
static                  extend();
static                  updateneeds();
static                  insert();
static  Boolean         teststmtmatch();

/************************************************************************/
/*                                                                      */
/*      _SEMCOM_eval_init -- initialize module                          */
/*                                                                      */
/************************************************************************/

_SEMCOM_eval_init()
{
   return;
};


/************************************************************************/
/*                                                                      */
/*      SEMCOM_change_incremental -- change a portion of the statement  */
/*              list                                                    */
/*                                                                      */
/************************************************************************/

SEMCOM_change_incremental(after,oldhd,oldtl,newhd,newtl)
   SEMCOM_STMT after;
   SEMCOM_STMT oldhd,oldtl;
   SEMCOM_STMT newhd,newtl;
{
   ITRACE("SEMCOM_change_incremental 0x%x 0x%x 0x%x 0x%x 0x%x",after,
             oldhd,oldtl,newhd,newtl);

   if (!SEMCOM__auto_recomp) {
      SEMCOM_record("Incremental compilation attempted,");
      SEMCOM_record("     but automatic recompilation is OFF");
    }
   else {
      CTRACE("    Begin change after 0x%x",after);
      SEMCOM_record
         (" ");
      SEMCOM_record("INCREMENTAL COMPILATION");

      head_merge(&after,&oldhd,&oldtl,&newhd,&newtl);
      tail_merge(&after,&oldhd,&oldtl,&newhd,&newtl);
      extend(&after,&oldhd,&oldtl,&newhd,&newtl);

      SEMCOM_remove(oldhd,oldtl);
      insert(after,newhd,newtl);

      CTRACE("    End change\n\n");
      SEMCOM_record
         (" ");
    };
};

/************************************************************************/
/*                                                                      */
/*      SEMCOM_change_procedure -- change enclosing procedure           */
/*                                                                      */
/************************************************************************/

SEMCOM_change_procedure(after,oldhd,oldtl,newhd,newtl)
   SEMCOM_STMT after,oldhd,oldtl,newhd,newtl;
{
   ITRACE("SEMCOM_change_procedure 0x%x 0x%x 0x%x 0x%x 0x%x",after,
             oldhd,oldtl,newhd,newtl);

   if (!SEMCOM__auto_recomp) {
      SEMCOM_record("Procedure compilation attempted,");
      SEMCOM_record("     but automatic recompilation is OFF");
    }
   else {
      CTRACE("    Begin change");
      SEMCOM_record
         (" ");
      SEMCOM_record("PROCEDURE COMPILATION");

      undo_execution(oldhd,oldtl);
      do_execution(after,newhd,newtl);

      CTRACE("    End change\n\n");
      SEMCOM_record
         (" ");
    }
};


/************************************************************************/
/*                                                                      */
/*      SEMCOM_change_complete -- recompile entire program              */
/*                                                                      */
/************************************************************************/

SEMCOM_change_complete(hd,tl)
   SEMCOM_STMT hd,tl;
{
   ITRACE("SEMCOM_change_complete 0x%x 0x%x",hd,tl);

   if (!SEMCOM__auto_recomp) {
      SEMCOM_record("Complete recompilation attempted,");
      SEMCOM_record("     but automatic recompilation is OFF");
    }
   else {
      CTRACE("    Begin change");
      SEMCOM_record
         (" ");
      SEMCOM_record("COMPLETE RECOMPILATION");

      undo_execution(hd,tl);
      do_execution(NULL,hd,tl);

      CTRACE("    End change\n\n");
      SEMCOM_record
         (" ");
    };
};


Not listed - SEMCOM_remove



/* »/ 

/* undo_execut i on — unexecute statement list */ 

/• */ 

/************************************************************************/ 



stat ic 

undo_execut i on(hd , t I ) 
SEMCOM_STMT hd , t I ; 



I 



register Integer ct; 

DTRACE("undo_execut ion 0x%x 0x%x" , hd, t I ) ; 

ct = 0; 

whi le (t I != NULL) j 

_SEMCOM_unexecute(t I ); 

-H-ct ; 

tl = (hd = tl ? NULL : t l->SEMCOM_last ) ; 

h 

CTRACE("Undo %d",ct); 
SEMCOM_record("Undo %d",ct); 



/************************************************************************/
/*                                                                      */
/*      do_execution -- execute statement list                          */
/*                                                                      */
/************************************************************************/

static
do_execution(after,hd,tl)
   SEMCOM_STMT after;
   SEMCOM_STMT hd,tl;
{
   register Integer ct;

   DTRACE("do_execution 0x%x 0x%x 0x%x",after,hd,tl);

   _SEMCOM_set_current(NULL);
   _SEMCOM_set_current(after);

   ct = 0;

   /* walk forwards from the head, stopping once the tail has been executed */
   while (hd != NULL) {
      _SEMCOM_execute(hd);
      ++ct;
      hd = (hd == tl ? NULL : hd->SEMCOM_next);
    };

   CTRACE("Do %d",ct);
   SEMCOM_record("Do %d",ct);
};

Not listed - head_merge

Not listed - tail_merge

Not listed - extend

Not listed - updateneeds

Not listed - insert

Not listed - teststmtmatch

/* end of semcomeval.c */

B.7. Program Listing: semcomwindow.c

/************************************************************************/
/*                                                                      */
/*      semcomwindow.c                                                  */
/*                                                                      */
/*      Window routines for incremental symbol compiler                 */
/*                                                                      */
/************************************************************************/
/*      James Popple September/October 1987                             */

#include "semcom_local.h"

/************************************************************************/
/*                                                                      */
/*      Local storage definitions                                       */
/*                                                                      */
/************************************************************************/

        ASH_WINDOW              SEMCOM_window;
        Boolean                 SEMCOM_auto_recomp;
        SEMCOM_COMPILATION_TYPE SEMCOM_comp_type;

static  Integer                 SEMCOM_vtid = -1;
static  Boolean                 eolfg = TRUE;
static  Integer                 semcom_font = 0;
static  Integer                 num_lines = 0;
static  Integer                 num_cols = 0;

/************************************************************************/
/*                                                                      */
/*      Forward Definitions                                             */
/*                                                                      */
/************************************************************************/

static                  new_semcom_window();
static                  semcom_control();
static                  setup_semcom_window();
static                  remove_semcom_window();
static                  semcom_define_scroll();

/************************************************************************/
/*                                                                      */

/*      Window Definitions                                              */
/*                                                                      */
/************************************************************************/

static WILLOW_DEFN semcom_window = {
   WILLOW_CLASS_USER,
   { "SEMCOM", "pecan.icons", 'C' },
   { 300,100, 800,1024, 400,300, 1, 1 },
   { ASH_WINDOW_HIT_PARENT,
     WILLOW_TITLE_TAB_SENSE,
     WILLOW_INSTANCE_SAVED_1 },
   new_semcom_window,
   NULL,
   { { { "COMPILE" },
       WILLOW_LOCATION_TAIL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_compile) },
     { { "INCREMENTAL" },
       WILLOW_LOCATION_TAIL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_incremental) },
     { { "PROCEDURE" },
       WILLOW_LOCATION_TAIL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_procedure) },
     { { "COMPLETE" },
       WILLOW_LOCATION_TAIL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_complete) },
     { { " AUTO ",NULL,0,0,"MANUAL" },
       WILLOW_LOCATION_TAIL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_auto) },
     { { "TOP" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_top) },
     { { "BOTTOM" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_bottom) },
     { { "SCROLL UP" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_scroll_up) },
     { { "SCROLL DOWN" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_scroll_down) },
     { { "UP" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_up) },
     { { "DOWN" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_down) },
     { { "CLEAR" },
       WILLOW_LOCATION_BOTTOM,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_USER(SEMCOM_button_clear) },
     { { "SCROLL" },
       WILLOW_LOCATION_R,
       WILLOW_BUTTON_REGION,
       WILLOW_ACTION_USER(SEMCOM_button_scroll) },
     { { "Move", "pecan.icons", '1', 0, 0, 1 },
       WILLOW_LOCATION_UL,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_MOVE(DEFAULT) },
     { { "Size", "pecan.icons", '0', 0, 0, 1 },
       WILLOW_LOCATION_UR,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_TYPE_AUX9, NULL, WILLOW_ACTION_DEFAULT },
     { { "Remove" },
       WILLOW_LOCATION_TITLE,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_ICON(DEFAULT) },
     { { "Push" },
       WILLOW_LOCATION_TITLE,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_PUSH(DEFAULT) },
     { { "Pop" },
       WILLOW_LOCATION_TITLE,
       WILLOW_BUTTON_NORMAL,
       WILLOW_ACTION_POP(DEFAULT) }
    }
};



/************************************************************************/
/*                                                                      */
/*      SEMCOM_window_init -- initialize semcom window                  */
/*                                                                      */
/************************************************************************/

SEMCOM_window_init()
{
   ITRACE("SEMCOM_window_init");

   SEMCOM_window = NULL;
   SEMCOM_vtid = NULL;
   SEMCOM_auto_recomp = TRUE;
   SEMCOM_comp_type = SEMCOM_COMP_INCREMENTAL;
   eolfg = TRUE;
   num_lines = 0;
   num_cols = 0;

   WILLOWdefine_window(&semcom_window);
};



/************************************************************************/
/*                                                                      */
/*      SEMCOM_scroll -- scroll VT window                               */
/*                                                                      */
/************************************************************************/

SEMCOM_scroll(dl,dc,abs)
   Integer dl,dc;
   Integer abs;
{
   Integer rl,rc;
   Integer cl,cc;
   register Integer b;

   ITRACE("SEMCOM_scroll %d %d %d",dl,dc,abs);

   VT$PUSH(SEMCOM_vtid);
   VT$NO_SCROLL;
   VT$INQ_REGION(&rl,&rc);

   if (dl != 0) {
      rl += dl*(num_lines/4);
    }
   else if (dc != 0) {
      rc += dc*(num_cols/4);
    }
   else {
      VT$INQ_CURRENT(&cl,&cc);
      ITRACE("\tscroll absolute %d %d %d %d %d",rl,rc,cl,cc,abs);
      b = MAX(cl,rl+num_lines);
      b = abs*b/100-num_lines/2;
      if (b < 0) b = 0;
      else if (b > cl) b = cl;
      rl = b;
      rc = 0;
    };

   ITRACE("\tscroll to %d %d",rl,rc);

   VT$REGION(rl,rc);
   VT$POP;

   semcom_define_scroll();
};



/************************************************************************/
/*                                                                      */
/*      SEMCOM_clear_screen -- clear VT window                          */
/*                                                                      */
/************************************************************************/

SEMCOM_clear_screen()
{
   ITRACE("SEMCOM_clear_screen");

   VT$PUSH(SEMCOM_vtid);
   VT$MOVE(0,0);
   VT$ERASE_SCREEN;
   VT$POP;

   SEMCOM_scroll(0,0,0);
};

/************************************************************************/
/*                                                                      */

/*      SEMCOM_record -- put message in transcript for SEMCOM window    */
/*                                                                      */
/************************************************************************/

SEMCOM_record(msg,a1,a2,a3)
   String msg;
   Universal a1,a2,a3;
{
   Character buf[256],buf1[256];

   if (SEMCOM_window == NULL) return;

   ITRACE("SEMCOM_record %s",msg);

   VT$PUSH(SEMCOM_vtid);
   VT$SCROLL;
   VT$FONT(semcom_font);

   if (!eolfg) VT$OUT("\n");

   sprintf(buf,msg,a1,a2,a3);
   sprintf(buf1,"%s\n",buf);
   VT$OUT(buf1);

#ifdef VAX
   printf("%s",buf1);
#endif

   VT$POP;

   eolfg = TRUE;
};



/************************************************************************/
/*                                                                      */
/*      new_semcom_window -- set up semcom window                       */
/*                                                                      */
/************************************************************************/

static
new_semcom_window()
{
   register ASH_WINDOW w;

   DTRACE("new_semcom_window");

   w = ASHinq_window();
   SEMCOM_window = w;

   ASHset_control(semcom_control);

   setup_semcom_window();
};

/************************************************************************/
/*                                                                      */

/*      semcom_control -- control message interpreter                   */
/*                                                                      */
/************************************************************************/

static
semcom_control(msg,w)
   String msg;
   ASH_WINDOW w;
{
   DTRACE("semcom_control %s 0x%x",msg,w);

   if (STREQL(msg,"PDS$NEXT")) return ASH_CONTROL_OK;

   if (STREQL(msg,"ASH$RESIZE")) {
      setup_semcom_window();
    }
   else if (STREQL(msg,"ASH$INQ_RESIZE")) {
      remove_semcom_window();
    }
   else if (STREQL(msg,"ASH$REMOVE")) {
      remove_semcom_window();
      SEMCOM_window = NULL;
    };

   return ASH_CONTROL_REJECT;
};



/************************************************************************/
/*                                                                      */
/*      setup_semcom_window -- set up semcom window                     */
/*                                                                      */
/************************************************************************/

static
setup_semcom_window()
{
   DTRACE("setup_semcom_window");

   ASHpush_window();
   ASHselect(SEMCOM_window);

   SEMCOM_vtid = VTopen();
   VT$PUSH(SEMCOM_vtid);
   VT$SCROLL;
   semcom_font = VT$LOADFONT(SEMCOM_FONT);
   VT$FONT(semcom_font);
   VT$INQ_SIZE(&num_lines,&num_cols);

   ASHset_window_name("Compilation monitor");

   VT$POP;
   ASHpop_window();

   semcom_define_scroll();
};

/************************************************************************/

/*                                                                      */
/*      remove_semcom_window -- remove semcom window                    */
/*                                                                      */
/************************************************************************/

static
remove_semcom_window()
{
   DTRACE("remove_semcom_window");

   VTclose(SEMCOM_vtid);
};



/************************************************************************/
/*                                                                      */
/*      semcom_define_scroll -- define scroll region                    */
/*                                                                      */
/************************************************************************/

static
semcom_define_scroll()
{
   Integer rl,rc;
   Integer cl,cc;
   register Integer b;

   DTRACE("semcom_define_scroll");

   VT$PUSH(SEMCOM_vtid);
   VT$INQ_CURRENT(&cl,&cc);
   VT$INQ_REGION(&rl,&rc);
   VT$POP;

   b = MAX(cl,rl+num_lines);

   WILLOWbutton_feedback(SEMCOM_window,"SCROLL",TRUE,
                            WILLOW_SCROLL_REGION(rl*100/b,
                                                 (rl+num_lines)*100/b));
};

/* end of semcomwindow.c */

B.8. Program Listing: semcombutton.c

/************************************************************************/
/*                                                                      */
/*      semcombutton.c                                                  */
/*                                                                      */
/*      Button handling routines for incremental symbol compiler        */
/*                                                                      */
/************************************************************************/
/*      James Popple September/October 1987                             */

#include "semcom_local.h"

/************************************************************************/
/*                                                                      */
/*      SEMCOM_button_compile -- handle COMPILE button                  */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_compile(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_compile %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_force_compilation();

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SEMCOM_button_incremental -- handle INCREMENTAL button          */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_incremental(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_incremental %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_comp_type = SEMCOM_COMP_INCREMENTAL;
   SEMCOM_record("Compilation set to INCREMENTAL");

   return TRUE;
};

/************************************************************************/
/*                                                                      */

/*      SEMCOM_button_procedure -- handle PROCEDURE button              */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_procedure(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_procedure %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_comp_type = SEMCOM_COMP_PROCEDURE;
   SEMCOM_record("Compilation set to PROCEDURE");

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SEMCOM_button_complete -- handle COMPLETE button                */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_complete(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_complete %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_comp_type = SEMCOM_COMP_COMPLETE;
   SEMCOM_record("Compilation set to COMPLETE");

   return TRUE;
};

/************************************************************************/

/*                                                                      */
/*      SEMCOM_button_auto -- handle AUTO on/off                        */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_auto(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_auto %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_auto_recomp = !SEMCOM_auto_recomp;

   WILLOWbutton_feedback(SEMCOM_window," AUTO ",!SEMCOM_auto_recomp,0);

   SEMCOM_record("Automatic recompilation turned %s",
                    (SEMCOM_auto_recomp ? "ON" : "OFF"));

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SEMCOM_button_top -- handle TOP button                          */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_top(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_top %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_scroll(0,0,0);

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SEMCOM_button_bottom -- handle BOTTOM button                    */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_bottom(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_bottom %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_scroll(0,0,100);

   return TRUE;
};

/************************************************************************/

/*                                                                      */
/*      SEMCOM_button_scroll_up -- handle SCROLL UP button              */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_scroll_up(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_scroll_up %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_scroll(-4,0,0);

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SEMCOM_button_scroll_down -- handle SCROLL DOWN button          */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_scroll_down(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_scroll_down %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_scroll(4,0,0);

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SEMCOM_button_up -- handle UP button                            */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_up(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_up %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_scroll(-1,0,0);

   return TRUE;
};

/************************************************************************/
/*                                                                      */

/*      SEMCOM_button_down -- handle DOWN button                        */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_down(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_down %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_scroll(1,0,0);

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SEMCOM_button_clear -- handle CLEAR button                      */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_clear(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_clear %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_clear_screen();

   return TRUE;
};

/************************************************************************/
/*                                                                      */
/*      SEMCOM_button_scroll -- handle scroll bar                       */
/*                                                                      */
/************************************************************************/

int
SEMCOM_button_scroll(dir)
   WILLOW_ACTION_MODE dir;
{
   ITRACE("SEMCOM_button_scroll %d",dir);

   if (dir != WILLOW_ACTION_DO) return;

   SEMCOM_scroll(0,0,WILLOWinq_scroll());

   return TRUE;
};

/* end of semcombutton.c */

Appendix C
Test Programs

The Pascal programs used for testing in §6.6 are listed in this appendix (§C.1 to
§C.4). The program listings have been formatted by PECAN, using the formatting
information included in the specification of Pascal (see §5.2).

C.1. Program Listing: test1.p

PROGRAM matrixproduct (input, output);

{ taken from [Findlay 81], pages 200-201 }

CONST
   n = 10;

TYPE
   matrix = ARRAY [1 .. n, 1 .. n] OF integer;

VAR
   a, b, p : matrix;

PROCEDURE readmatrix (VAR m : matrix);
VAR
   i, j : 1 .. n;
BEGIN { Procedure readmatrix }
   FOR i := 1 TO n DO
      FOR j := 1 TO n DO
         READ(m[i, j])
END;

PROCEDURE writematrix (VAR m : matrix);
VAR
   i, j : 1 .. n;
BEGIN { Procedure writematrix }
   FOR i := 1 TO n DO
   BEGIN
      WRITE('[');
      FOR j := 1 TO n DO
         WRITE(m[i, j]);
      WRITELN(']')
   END
END;

PROCEDURE multiplymatrices (m1, m2 : matrix; VAR product : matrix);
VAR
   i, j, k : 1 .. n;
   scalarproduct : integer;
BEGIN { Procedure multiplymatrices }
   FOR i := 1 TO n DO
      FOR j := 1 TO n DO
      BEGIN
         scalarproduct := 0;
         FOR k := 1 TO n DO
            scalarproduct := scalarproduct+m1[i, k]*m2[k, j];
         product[i, j] := scalarproduct
      END
END;

BEGIN { Program matrixproduct }
   readmatrix({ m := } a);
   readmatrix({ m := } b);
   multiplymatrices({ m1 := } a, { m2 := } b, { product := } p);
   writematrix({ m := } p)
END.



C.2. Program Listing: test2.p

PROGRAM tableoftans (output);

{ taken from [Findlay 81], pages 167-168 }

CONST
   pi = 3.1415926536;

VAR
   degrees : 0 .. 360;
   line : 0 .. 36;

FUNCTION tan (x : real): real;
{ no declarations }
BEGIN { Function tan }
   tan := sin(x)/cos(x)
END;

BEGIN { Program tableoftans }
   WRITELN(' Angle' :5, 'Tangent' :15);
   WRITELN('');
   FOR line := 0 TO 36 DO
   BEGIN
      degrees := 10*line;
      WRITE(degrees:5);
      IF degrees MOD 180 = 90 THEN
         WRITELN(' Infinity' : 15)
      ELSE
         WRITELN(tan({ x := } degrees*pi/180) : 15)
   END
END.

C.3. Program Listing: test3.p

PROGRAM factorial (input, output);

{ traditional }

VAR
   x : integer;

FUNCTION factorial (n : integer): integer;
{ no declarations }
BEGIN { Function factorial }
   IF n = 1 THEN
      factorial := 1
   ELSE
      factorial := n*factorial({ n := } n-1)
END;

BEGIN { Program factorial }
   WRITELN('Enter a number:');
   READLN(x);
   WRITELN(x, '! = ', factorial({ n := } x))
END.



C.4. Program Listing: test4.p

PROGRAM recursivegcd (output);

{ taken from [Jensen 78], page 82 }

VAR
   x, y, n : integer;

FUNCTION gcd (m, n : integer): integer;
{ no declarations }
BEGIN { Function gcd }
   IF n = 0 THEN
      gcd := m
   ELSE
      gcd := gcd({ m := } n, { n := } m MOD n)
END;

PROCEDURE try (a, b : integer);
{ no declarations }
BEGIN { Procedure try }
   WRITELN(a, b, gcd({ m := } a, { n := } b))
END;

BEGIN { Program recursivegcd }
   try({ a := } 18, { b := } 27);
   try({ a := } 312, { b := } 2142);
   try({ a := } 61, { b := } 53);
   try({ a := } 98, { b := } 868)
END.

Appendix D
Earley's Algorithm

D.1. Introduction

Earley's algorithm is a general context-free parsing algorithm. It handles a larger
class of grammars in linear time than most restricted algorithms. For
unambiguous grammars it is bounded by n^2 (where n is the number of symbols in
the input string). In the worst case its time bound is n^3.

Earley's algorithm was first described in his Ph.D. thesis [Earley 68]. It is also
described in [Aho 72] and (with greater pellucidity) in [Earley 70]. This appendix
uses the notation from [Earley 70]. An analysis of the efficiency of the algorithm
can be found in that article.

D.2. The Recognizer

A parser must be able to recognize whether an input string is a valid sentence of
a given grammar. Earley's recognizer scans, from left to right, an input string
X_1 ... X_n of symbols, and is able to look ahead some fixed number k of symbols.

While scanning the input string, the recognizer constructs sets (S_i) of states.
Each of these state sets is initially empty. Each state s in a state set is a
quadruple of the form

      s = <p, j, f, α>

where p is an integer which identifies the production from which the recognizer is
        attempting to derive the current section of the input string (the
        productions of the grammar are numbered for this purpose),
      j is an integer referring to a place within the right hand side of the
        production p (this indicates how much of the production has been
        scanned),
      f is an integer referring to the position in the input string where the
        recognizer first began to look for this instance of the production p,
and   α is a k-symbol string which is syntactically allowed to follow this instance
        of the production p.

It is necessary to ensure that there will always be k symbols for the recognizer
to see when looking ahead, even when the input string is fully scanned. To
achieve this, a terminating symbol ⊣ is introduced¹ and k+1 terminating symbols
are placed at the right end of the input string.

The recognizer starts by inventing a new production (production 0)

      φ --> R⊣

where φ is a new non-terminal symbol and R is the root of the grammar (the
non-terminal which produces a sentence).
A state s is put into the state set S_0 so that

      s = <0, 0, 0, ⊣^k>

where ⊣^k is a string of k terminating symbols.

For clarity, states will be represented as the pth production with a dot² marking
the position of the pointer j, together with an integer (the value of f) and a
k-symbol string (α). So, the state s can be represented as

      φ --> .R⊣      0      ⊣^k
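
The state quadruple, the padded input and the dotted notation can be sketched in a few lines of Python. This is only an illustration of the definitions above: the names `State`, `pad_input`, `initial_state` and `show` are hypothetical and do not come from [Earley 70] or from PECAN, and every grammar symbol is assumed to be a single character.

```python
from collections import namedtuple

END = "⊣"  # the terminating metasymbol

# A state is the quadruple <p, j, f, alpha> described above.
State = namedtuple("State", "p j f alpha")


def pad_input(symbols, k):
    """Append k+1 terminating symbols so lookahead never runs off the end."""
    return list(symbols) + [END] * (k + 1)


def initial_state(k):
    """The state <0, 0, 0, END^k> for the invented production 0."""
    return State(p=0, j=0, f=0, alpha=(END,) * k)


def show(state, productions):
    """Render a state as the dotted production, then f, then alpha."""
    lhs, rhs = productions[state.p]
    dotted = "".join(rhs[:state.j]) + "." + "".join(rhs[state.j:])
    return f"{lhs} --> {dotted}   {state.f}   {''.join(state.alpha)}"


# Production 0 is phi --> R END, where R stands for the root of the grammar.
productions = [("φ", ("R", END))]
print(show(initial_state(1), productions))
```

With k = 1 the printed line is the dotted form of the initial state, and `pad_input("ab", 1)` yields the two input symbols followed by two terminators.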

D.3. The Recognizer's Operations

The recognizer processes the states in the state set S_i in order, using only three
operations: predictor, scanner and completer. These operations are applied to a
state s in the following ways:

¹ ⊣ is a metasymbol; it does not occur in the grammar.

² Another metasymbol.

Predictor

If there is a non-terminal symbol to the right of the dot in the production, add
a new state to S_i for each alternative production of that non-terminal. Each of
these new states has

• its dot at the beginning of the production (as none of the symbols of
the production has yet been scanned)

• its f assigned to i (the current position in the input string)

• its α assigned to the k symbols that follow the non-terminal (these are
determined by reference to the production in s and/or the value of α in s).

Scanner

If there is a terminal symbol to the right of the dot in the production, compare
that terminal symbol with the symbol X_{i+1} (the next symbol in the input string).
If they match, add to S_{i+1} a copy of s with

• its dot moved to the right to indicate that the terminal symbol has
been scanned

• its f unchanged

• its α unchanged.

Completer

If the dot is at the end of a production, compare α with X_{i+1} ... X_{i+k} (the next
k symbols of the input string). If they match, go back to the state set where the
recognizer first began to look for this instance of the production (i.e. S_f). Take all
of the states which could have led to the current production (i.e. those states with
the same non-terminal to the right of the dot as is on the left hand side of the
production in s). Copy these states from S_f into S_i, modified so that each of the
new states has

• its dot moved to the right to indicate that the non-terminal symbol has
been scanned

• its f unchanged

• its α unchanged.

Each of these operations is applied in turn to the states in S_i, then the
recognizer processes the states in S_{i+1}. If applying all three operations to S_i leaves
S_{i+1} empty then the input string is not a valid sentence of the language. This
means that Earley's algorithm shares the property with some (but not all) other
parsing algorithms that as soon as a point is reached in the input string such that
no possible following symbols could make the input string a valid sentence of the
grammar, the recognizer realizes that the input string is not well-formed.

If the recognizer ever produces a state set S_{i+1} consisting only of the state

      φ --> R⊣.      0      ⊣^k

then the input string is a valid sentence of the grammar.
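
The three operations and the acceptance test can be put together as a small recognizer. The sketch below is not the PECAN implementation; it is a minimal Python rendering of the description above under stated assumptions: the lookahead is fixed at k = 1, the grammar has no empty productions, every symbol is a single character, and the names `earley_recognize` and `G` are hypothetical. FIRST sets are computed so that the predictor can find the symbol that follows a predicted non-terminal even when that symbol is itself a non-terminal. The usage lines apply it to the grammar G that is defined in Figure D-1 in the next section.

```python
END = "⊣"  # the terminating metasymbol


def earley_recognize(grammar, root, text):
    """Earley recognizer with k = 1 lookahead (no empty productions).

    Returns the state sets S_0 .. S_{n+1} as lists of tuples
    (p, j, f, alpha), or None if the input is rejected."""
    prods = [("φ", (root, END))] + [(l, tuple(r)) for l, r in grammar]
    nonterms = {lhs for lhs, _ in prods}

    # FIRST sets: terminals that can begin a string derived from each
    # non-terminal (fixed-point iteration; valid without empty productions).
    first = {a: set() for a in nonterms}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in prods:
            new = first[rhs[0]] if rhs[0] in nonterms else {rhs[0]}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True

    n = len(text)
    x = list(text) + [END, END]        # x[i] holds X_{i+1}; k+1 terminators
    sets = [[] for _ in range(n + 2)]

    def add(i, state):
        if state not in sets[i]:
            sets[i].append(state)

    add(0, (0, 0, 0, END))
    for i in range(n + 1):
        for p, j, f, alpha in sets[i]:  # the list grows while it is scanned
            lhs, rhs = prods[p]
            if j == len(rhs):                              # completer
                if alpha == x[i]:
                    for q, jq, fq, aq in list(sets[f]):
                        if prods[q][1][jq:jq + 1] == (lhs,):
                            add(i, (q, jq + 1, fq, aq))
            elif rhs[j] in nonterms:                       # predictor
                rest = rhs[j + 1:]
                if not rest:
                    looks = [alpha]
                elif rest[0] in nonterms:
                    looks = sorted(first[rest[0]])
                else:
                    looks = [rest[0]]
                for q, (qlhs, _) in enumerate(prods):
                    if qlhs == rhs[j]:
                        for a in looks:
                            add(i, (q, 0, i, a))
            elif rhs[j] == x[i]:                           # scanner
                add(i + 1, (p, j + 1, f, alpha))
        if not sets[i + 1]:
            return None                # S_{i+1} is empty: not a sentence
    return sets if sets[n + 1] == [(0, 2, 0, END)] else None


# The grammar G of Figure D-1; each right hand side is a string of symbols.
G = [("E", "T+E"), ("E", "T"), ("T", "F*T"), ("T", "F"),
     ("F", "(E)"), ("F", "a")]

print(earley_recognize(G, "E", "(a+a)*a") is not None)
print(earley_recognize(G, "E", "(a+a") is None)
```

Keeping each state set as a list preserves the order in which states are created, so the sets can be compared directly with the hand-worked example.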

D.4. Application of the Recognizer to an Example Grammar

Consider the grammar G defined in Figure D-1.³

      E --> T+E
      E --> T
      T --> F*T
      T --> F
      F --> (E)
      F --> a

Figure D-1: Definition of the Grammar G

The terminal symbols of the grammar G are {a, +, *, (, )}. The non-terminals are
{E, T, F}. Let the input string (X_1 ... X_7) be

      (a+a)*a

Let k=1, so that the recognizer will only look one symbol ahead when scanning the
input string.

As the root of the grammar is E, the recognizer puts the following state into S_0

      φ --> .E⊣       0      ⊣

before starting the repeated application of the three operations.

³ Example grammar G is taken from the description of Earley's algorithm in [Aho 72].

To the right of the dot is a non-terminal symbol, so the predictor is used. The
predictor adds a new state to S_0 for each alternative production of E, namely

      E --> .T+E      0      ⊣
      E --> .T        0      ⊣

The dots are at the beginning of the productions because none of the symbols has
been scanned yet. Each α=⊣ since a ⊣ is to be found after E in the original
state. The predictor is applied to the two new states. This results in the
following states being added to S_0

      T --> .F*T      0      +
      T --> .F        0      +
      T --> .F*T      0      ⊣
      T --> .F        0      ⊣





The predictor is applied repeatedly to the states in S_0 until all of the newly-
created states have been processed, at which stage S_0 will contain the following
states

      φ --> .E⊣       0      ⊣
      E --> .T+E      0      ⊣
      E --> .T        0      ⊣
      T --> .F*T      0      +
      T --> .F        0      +
      T --> .F*T      0      ⊣
      T --> .F        0      ⊣
      F --> .(E)      0      *
      F --> .a        0      *
      F --> .(E)      0      +
      F --> .a        0      +
      F --> .(E)      0      ⊣
      F --> .a        0      ⊣
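
The repeated application of the predictor can be checked mechanically. The following Python sketch covers the predictor only, and only for this grammar: it relies on the observation that in G every non-terminal on a right hand side is followed either by a terminal or by nothing, so the lookahead of a predicted state is either that terminal or the α of the predicting state. The names `PRODS`, `predict_closure` and `render` are hypothetical.

```python
END = "⊣"

# Productions of G, numbered from 0; production 0 is the invented one.
PRODS = [("φ", ("E", END)),
         ("E", ("T", "+", "E")), ("E", ("T",)),
         ("T", ("F", "*", "T")), ("T", ("F",)),
         ("F", ("(", "E", ")")), ("F", ("a",))]
NONTERMS = {"φ", "E", "T", "F"}


def predict_closure(i, states):
    """Apply the predictor to the states of S_i until nothing new appears.

    States are tuples (p, j, f, alpha) with a one-symbol alpha (k = 1)."""
    states = list(states)
    for p, j, f, alpha in states:          # the list grows while scanned
        rhs = PRODS[p][1]
        if j < len(rhs) and rhs[j] in NONTERMS:
            # Assumption (true for G): the symbol after a non-terminal is a
            # terminal, or the non-terminal ends the production.
            look = rhs[j + 1] if j + 1 < len(rhs) else alpha
            for q, (lhs, _) in enumerate(PRODS):
                new = (q, 0, i, look)
                if lhs == rhs[j] and new not in states:
                    states.append(new)
    return states


def render(state):
    p, j, f, alpha = state
    lhs, rhs = PRODS[p]
    return f"{lhs} --> {''.join(rhs[:j])}.{''.join(rhs[j:])}   {f}   {alpha}"


s0 = predict_closure(0, [(0, 0, 0, END)])
for state in s0:
    print(render(state))
```

Starting from the single initial state, the closure contains thirteen states, in the same order as the list above.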



The scanner is now applied. As X_1 = (, the scanner will add to S_1 those states
in S_0 with a ( to the right of the dot, with each dot moved to the right to
indicate that the ( has been scanned. S_1 now contains these states

      F --> (.E)      0      *
      F --> (.E)      0      +
      F --> (.E)      0      ⊣


The predictor is applied to all of the states in S_1 as they all have a non-terminal
to the right of the dot. Repeated application of the predictor leaves S_1 containing
the following states

      F --> (.E)      0      *
      F --> (.E)      0      +
      F --> (.E)      0      ⊣
      E --> .T+E      1      )
      E --> .T        1      )
      T --> .F*T      1      +
      T --> .F        1      +
      T --> .F*T      1      )
      T --> .F        1      )
      F --> .(E)      1      *
      F --> .a        1      *
      F --> .(E)      1      +
      F --> .a        1      +
      F --> .(E)      1      )
      F --> .a        1      )



The scanner can be applied again. X_2 = a, so the scanner will add to S_2 every
state in S_1 with an a to the right of the dot (the dot in the production in each
new state is moved to the right). S_2 now contains the states

      F --> a.        1      *
      F --> a.        1      +
      F --> a.        1      )



The completer can now be applied for the first time. Each of the states in S_2
has a dot at the end of its production, but only the second state in S_2 has an α
which matches the lookahead string (as k=1, the lookahead string is "+" (i.e. X_3)).
The completer goes back to the state set where the recognizer first began to look
for this instance of the production (pointed to by f). As f=1, the completer goes
back to S_1. Now the completer adds to S_2 all those states in S_1 that could have
led to the second production in S_2, with the dot moved to the right to indicate
that the non-terminal (F) has been successfully scanned. So, the completer will
add the following states to S_2

      T --> F.*T      1      +
      T --> F.        1      +
      T --> F.*T      1      )
      T --> F.        1      )

The completer is applied again to the second of these new states as its α matches
the lookahead string. This step adds to S_2 the following states from S_1

      E --> T.+E      1      )
      E --> T.        1      )

The completer cannot be applied again to S_2, so the recognizer continues with the
application of the scanner to the states in S_2.

The recognizer will continue in the manner described above until it produces a
state set⁴ which contains only the state

      φ --> E⊣.       0      ⊣

As E is the root of the grammar G, the recognizer has reached the stage where
the input string

      (a+a)*a

has been recognized as a valid sentence of the grammar. The complete series of
state sets for this example appears in Figure D-2.

D.5. Constructing a Parser from the Recognizer 

To construct a parser, the recognizer must be modified so that it builds a 
derivation tree during the recognition process. This is achieved by building links 
between states when the completer operation is used. (For the purposes of 
building the derivation tree, the values of a can be ignored; lookahead is only 
required for the recognizer.) 

Whenever the completer adds a state to a state set, the parser builds a pointer 
from the non-terminal (before the dot in the new state) to the state which 
triggered the completer operation (which has a production for that non-terminal). 
If the non-terminal is ambiguous then more than one state will cause the completer 
operation to add the same new state. In that case, there will be a set of pointers 
from the non-terminal in the new state (one for each completer operation which 
added that new state). 

When the whole input string has been scanned, the derivation tree for the
sentence will be attached to the final state

      φ --> R⊣.

If the sentence that is scanned is ambiguous then all possible derivation trees will 
be attached to the final state. 
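
One way to realize the pointer-building step is sketched below in Python. It is a simplification rather than the construction of [Earley 70] in full: lookahead is dropped (as noted above, α is not needed for building the tree), the grammar is assumed to be unambiguous and free of empty productions, and instead of a set of pointers each state keeps a tuple holding the subtree found for every symbol before its dot. For an ambiguous grammar this sketch would keep only the first derivation found. The names `earley_parse`, `leaves` and `G` are hypothetical.

```python
END = "⊣"


def earley_parse(grammar, root, text):
    """Return a derivation tree (lhs, children) for text, or None."""
    prods = [("φ", (root, END))] + [(l, tuple(r)) for l, r in grammar]
    nonterms = {lhs for lhs, _ in prods}
    n = len(text)
    x = list(text) + [END]
    sets = [[] for _ in range(n + 2)]   # states (p, j, f) in creation order
    kids = [{} for _ in range(n + 2)]   # state -> subtrees before the dot

    def add(i, state, subtrees):
        if state not in kids[i]:        # first derivation found is kept
            kids[i][state] = subtrees
            sets[i].append(state)

    add(0, (0, 0, 0), ())
    for i in range(n + 1):
        for p, j, f in sets[i]:         # the list grows while it is scanned
            lhs, rhs = prods[p]
            here = kids[i][(p, j, f)]
            if j == len(rhs):                              # completer
                tree = (lhs, here)      # what the new states point to
                for q, jq, fq in list(sets[f]):
                    if prods[q][1][jq:jq + 1] == (lhs,):
                        add(i, (q, jq + 1, fq),
                            kids[f][(q, jq, fq)] + (tree,))
            elif rhs[j] in nonterms:                       # predictor
                for q, (qlhs, _) in enumerate(prods):
                    if qlhs == rhs[j]:
                        add(i, (q, 0, i), ())
            elif rhs[j] == x[i]:                           # scanner
                add(i + 1, (p, j + 1, f), here + (rhs[j],))
    final = kids[n + 1].get((0, 2, 0))  # the state for production 0, complete
    return final[0] if final else None


def leaves(tree):
    """Concatenate the terminal symbols at the leaves of a tree."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[1])


G = [("E", "T+E"), ("E", "T"), ("T", "F*T"), ("T", "F"),
     ("F", "(E)"), ("F", "a")]
tree = earley_parse(G, "E", "(a+a)*a")
print(tree[0], leaves(tree))
```

Reading the leaves of the returned tree from left to right gives back the input string, which is a convenient sanity check on the pointers.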



⁴ The final state set is S_{n+1} (in this example S_8).

Input string = (a+a)*a
k=1

S_0   (X_1 = "(")

      φ --> .E⊣       0      ⊣
      E --> .T+E      0      ⊣
      E --> .T        0      ⊣
      T --> .F*T      0      +
      T --> .F        0      +
      T --> .F*T      0      ⊣
      T --> .F        0      ⊣
      F --> .(E)      0      *
      F --> .a        0      *
      F --> .(E)      0      +
      F --> .a        0      +
      F --> .(E)      0      ⊣
      F --> .a        0      ⊣

S_1   (X_2 = a)

      F --> (.E)      0      *
      F --> (.E)      0      +
      F --> (.E)      0      ⊣
      E --> .T+E      1      )
      E --> .T        1      )
      T --> .F*T      1      +
      T --> .F        1      +
      T --> .F*T      1      )
      T --> .F        1      )
      F --> .(E)      1      *
      F --> .a        1      *
      F --> .(E)      1      +
      F --> .a        1      +
      F --> .(E)      1      )
      F --> .a        1      )

S_2   (X_3 = +)

      F --> a.        1      *
      F --> a.        1      +
      F --> a.        1      )
      T --> F.*T      1      +
      T --> F.        1      +
      T --> F.*T      1      )
      T --> F.        1      )
      E --> T.+E      1      )
      E --> T.        1      )

S_3   (X_4 = a)

      E --> T+.E      1      )
      E --> .T+E      3      )
      E --> .T        3      )
      T --> .F*T      3      +
      T --> .F        3      +
      T --> .F*T      3      )
      T --> .F        3      )
      F --> .(E)      3      *
      F --> .a        3      *
      F --> .(E)      3      +
      F --> .a        3      +
      F --> .(E)      3      )
      F --> .a        3      )

S_4   (X_5 = ")")

      F --> a.        3      *
      F --> a.        3      +
      F --> a.        3      )
      T --> F.*T      3      +
      T --> F.        3      +
      T --> F.*T      3      )
      T --> F.        3      )
      E --> T.+E      3      )
      E --> T.        3      )
      E --> T+E.      1      )
      F --> (E.)      0      *
      F --> (E.)      0      +
      F --> (E.)      0      ⊣

S_5   (X_6 = *)

      F --> (E).      0      *
      F --> (E).      0      +
      F --> (E).      0      ⊣
      T --> F.*T      0      +
      T --> F.        0      +
      T --> F.*T      0      ⊣
      T --> F.        0      ⊣

S_6   (X_7 = a)

      T --> F*.T      0      +
      T --> F*.T      0      ⊣
      T --> .F*T      6      +
      T --> .F        6      +
      T --> .F*T      6      ⊣
      T --> .F        6      ⊣
      F --> .(E)      6      *
      F --> .a        6      *
      F --> .(E)      6      +
      F --> .a        6      +
      F --> .(E)      6      ⊣
      F --> .a        6      ⊣

S_7   (X_8 = ⊣)

      F --> a.        6      *
      F --> a.        6      +
      F --> a.        6      ⊣
      T --> F.*T      6      +
      T --> F.        6      +
      T --> F.*T      6      ⊣
      T --> F.        6      ⊣
      T --> F*T.      0      +
      T --> F*T.      0      ⊣
      E --> T.+E      0      ⊣
      E --> T.        0      ⊣
      φ --> E.⊣       0      ⊣

S_8

      φ --> E⊣.       0      ⊣

Figure D-2: State Sets for the Example Input String



In the example given in §D.4 the completer operation is first applied to the state

      F --> a.        1

in S_2, and the following states are added to S_2

      T --> F.*T      1
      T --> F.        1

The parser builds two pointers (one from the F in each of the new states) to the
state

      F --> a.        1

A diagram showing the way in which the parser links the states for the whole
input string appears in Figure D-3. Although there are several states which are
pointed to by more than one other state, there is only one derivation tree attached
to the final state (φ --> E⊣.). (If the grammar G had been defined ambiguously, in
that it provided more than one way to parse the input string, then the parser
would have attached to the final state one derivation tree for each alternative
derivation of the input string.) Following the pointers from the final state, the
parse tree for the whole sentence can be constructed. The parse tree for the
example input string is shown in Figure D-4.

Input string = (a+a)*a

Figure D-3: Linked States for the Example Input String

Input string = (a+a)*a

Figure D-4: Parse Tree for the Example Input String
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